how to describe a dataset in a report

The dataset can be in the form of a NumPy array list tuple or similar data structure. Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support. The delimiter between values in the file. For more information, see Data Method Selector. You can create a unique key with the following options: The following example creates a unique key for a CSV file: dataset/mydataset/!{iotanalytics:scheduleTime}/!{iotanalytics:versionId}.csv. While Plotly has become the leader for interactive visualizations, because it stores the data for each plot generated in the users browser session and renders every interactive data point, it can really slow down the load time of your report if you are working with multiple plots or with a very large dataset, which negatively impacts the readers experience. A data set is a collection of numbers or values that relate to a particular subject. The values of variables used in the context of the execution of the containerized application (basically, parameters passed to the application). In order that the next record type in a dataset be known (so that its length is known as well), the order of the records is fixed. >> When the Dynamic Filter property is set to False, the SrsReportDataContract object will not contain the query contract. Raw data is a term used to describe data in its most basic digital format. If not specified or set to null, only the latest version plus the latest succeeded version (if they are different) are kept for the time period specified by the retentionPeriod parameter. For example, if you remove a field in the query and the field displays in the report, the report will display an empty column for the field. This will create an auto design for the report displaying the data for the dataset. installation instructions For each SSL connection, the AWS CLI will verify SSL certificates. Each rule must contain at least one column and at least one user or group. more bars are towards left or right respectively which can be essentials in making conclusions. Although usually digital, research data also includes non-digital formats such as laboratory notebooks and diaries. Analytic applications and data scientists can then review the results to uncover patterns and enable business leaders to draw informed insights. A data set is different from a data warehouse, data lake, and data mill because it focuses on a much narrower topic. You are viewing the documentation for an older major version of the AWS CLI (version 1). It is always best to keep your report as clear and concise as possible. Probably not a classic match! Upload a Power BI Desktop file that contains a model. << This option overrides the default behavior of verifying SSL certificates. 4 0 obj sysuse auto, clear. Creating Animated Data Visualisations in Python. 25%/50%/75% what value accounts for 25%/50%/75% of the data. Key pieces of information include: The source of the dataset A list and explanation of variables in the dataset. 13 0 obj Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values. >> Any data team can create an analytics report, but not all are creating actionable reports. The default value is 60 seconds. An option that controls whether a child dataset of a direct query can use this dataset as a source. When this value is True, a dataset parameter and report parameter will be created. The name of the dataset whose latest contents are used as input to the notebook or application. Newer communication datasets contain patient symptom reports, real-time audiofiles of visits, coded communication data, and visit outcomes. Your two main options are Bokeh and plotly. For static plotting or for very unique or customised plots, where you may need to build your own solution, matplotlib and seaborn are your answers. If the data method that you select has parameters and you have not yet specified values for the parameters, you will be prompted to enter values for the parameters. When describing trends in a report you need to pay careful attention to the use of prepositions: Sales in the UK increased rapidly between 2007 and 2010. endobj How you generated your graphs is not important for these users, only the visuals and the insights they display are. /Type /Annot << /BS<> from arcgis import geoanalytics as ga We can get an idea of the shape and spread of the continuous data through a histogram. Use the subset to understand how your big data appears when added to a map or visualized in an attribute table. Think of describe, replace as separate from but related to describe without the replace option. Quantitative data is in numeric form , which can be discrete that includes finite numerical values or continuous which also takes fractional values apart from finite values. list The Amazon Resource Name (ARN) of the data source. What is the difference between c-chart and u-chart? Groupings of columns that work together in certain Amazon QuickSight features. Say we want to see the speed with which cars drive in a city. In this case the height or length of the bar indicates the measured value or frequency. /Parent 24 0 R But if you are told that the relative frequency is 0.8 which means 80% of students hailed from Bangalore, you might get an idea that students from this city are much more inclined for data science than other cities. The below histogram shows that Indias GDP has increased consistently in the last decade. Create a legend if necessary. The column schema from the SQL query result set. extent_output=true, output_name="Hurricanes_describe") . So rather we use range like 5060 km/h and so on to plot histogram which also uses bars to graphically represent the data, but here each bar group is a range. Mean what is the mean average. Power BI datasets represent a source of data that's ready for reporting and visualization. This could be due to a parameter that was passed to the query and is not recommended. What is a dataset in report? /Subtype/Link/A<> The ARN of the role that gives permission to the system to access required resources to run the, The type of the compute resource used to execute the, The size, in GB, of the persistent storage available to the resource instance used to execute the. --cli-input-json (string) This can help us get an idea of whether there are patterns in the data. We were able to get results about our data in general, but then get more detailed insights by using '.groupby()' to group our data by referee. The ARN of the role that grants IoT Analytics permission to interact with your Amazon S3 and Glue resources. A Medium publication sharing concepts, ideas and codes. Your introduction should lay out the objective, problem definition and the methods employed. Check the skewness in the data whether it is left skewed or right skewed i.e. How do you describe a data set? ",]XYkm&b}Ao4H?OcfSAbdz$U(U,J%Ta#<2 Did you find this page useful? Data methods that do not return a System.Data.DataTable will not display in the window. Descriptive statistics is essentially describing the data through methods such as graphical representations, measures of central tendency and measures of variability. Lets use .groupby() to create a dataset grouped by the Referee column. 38 fouls, 3 yellows and a red between Mazzarris Watford and Stoke. A data report is an analytical tool used to display past present and future data to efficiently track and optimize the performance of a company. In this lesson you will learn how to describe a data set by using characteristics of the quantity measured. endobj For instance, a qualitative data which contains gender of students in a class, a suitable method to describe would be frequency of males and females. The Contents of the GROUP Data Set. Bar graphs are used to show relationships between different data series that are independent of each other. The element you can use to define tags for row-level security. {iotanalytics:versionId} to specify the key, you might get duplicate keys. A DatasetAction object that specifies how dataset contents are automatically created. GeoAnalytics Tools and standard feature analysis tools in ArcGIS Enterprise have different parameters and capabilities. For the latest documentation, see Microsoft Dynamics 365 product documentation. The data is provided as is for others to reuse eg. Provide a table of contents, use headings and subheadings accordingly, which gives readers an overview and will help orientate them as they read through the post this is particularly important if your content is complex. We were able to get results about our data in general, but then get more detailed insights by using '.groupby ()' to group our data by referee. The region to use. The idea of a dataset description is to convey basic information about the data assembly and wrangling process (without dragging the audience through all the pain you experienced). This is a variant type structure. We were able to get results about our data in general, but then get more detailed insights by using .groupby() to group our data by referee. A list of data rules that send notifications to CloudWatch, when data arrives late. An Glue Data Catalog table contains partitioned data and descriptions of data sources and targets. The namespace associated with the dataset that contains permissions for RLS. The Describe Dataset tool provides an overview of your big data. And there we have it! With inferential statistics, you take data from samples and make generalizations about a population. Sometimes, it is better to represent a data by taking range of values rather than every individual value. The Quick Answer: Pandas describe Provides Helpful Summary Statistics This example describes a hurricane tracking dataset in a big data file share and outputs a subset of 200 hurricane features and an extent layer.# Import the required ArcGIS API for Python modules It summarizes the data in a meaningful way which enables us to generate insights from it. An expression that must evaluate to a Boolean value. This structure is also sometimes referred to as a data frame. For this structure to be valid, only one of the attributes can be non-null. A string that you want to use to delimit the values when you pass the values at run time. Have a beginning, middle, and end. /Rect [226.668 516.878 259.922 523.129] The query string that is used to retrieve the data for the dataset based on the given data source and data source type. The data set consists of data of one or more members corresponding to each row. The Describe Dataset tool provides an overview of your big data. To use the following examples, you must have the AWS CLI installed and configured. AWS CLI version 2, the latest major version of AWS CLI, is now stable and recommended for general use. The maximum socket read time in seconds. The output will vary depending on what is provided. The greater the bin size, the lesser would be the bins and the lesser granular would be the analysis as it would represent lesser details. >> Overrides config/env settings. endobj Fouls and red cards are also very close between both teams. The x-axis displays the values in the dataset and the y-axis shows the frequency of each value. Descriptive statistics consists of two basic categories of measures: measures of central tendency and measures of variability (or spread). If other arguments are provided on the command line, the CLI values will override the JSON-provided values. Note: A physical table type for an S3 data source. For more information, see DescribeDataset in the AWS IoT Analytics API Reference. A FieldFolder element is a folder that contains fields and nested subfolders. You are viewing the documentation for an older major version of the AWS CLI (version 1). For instance, based on a dataset's description, users may decide to explore reports that are based on the dataset, or to create their own reports based on the dataset. Lets get started by firing up a season-long dataset of referees and their cards given in each game last season. As with any other type of content, your reports should follow a specific structure, which will help the reader frame the reports contents & thus facilitate an easier read. This option overrides the default behavior of verifying SSL certificates. Performs service operation based on the JSON string provided. /Type /Annot Optionally, the tool can output a feature layer representing a sample of your input features, or a single polygon feature layer that represents the extent of your input features. If its for a sales lead, emphasise the core metrics by which their department evaluates performance. When dataset contents are created they are delivered to destinations specified here. The user or group rules associated with the dataset that contains permissions for RLS. A record is defined to be a series of bytes which are to be read or written together. This geoprocessing tool is available with ArcGIS Enterprise 10.7 or later. Both are really powerful declarative libraries and worth considering. Pick the chart, graph or table that best fits with the paragraph and move on to the next point. /Rect [69.272 526.23 159.362 534.143] /Rect [226.668 526.23 279.454 534.143] Configuration information for coordination with Glue, a fully managed extract, transform and load (ETL) service. Therefore, databases are typically larger and contain a lot more information than a data set. Understand attribute values with summarized field statistics. Try not to break-up the flow of the report with too many graphics that essentially show the same thing. The column tags to remove from this column. The name of this column in the underlying data source. Often a data set or collection contains a large number of files perhaps organized into a number of directories or database tables. Business Logic to use a data method defined in a reporting project. help getting started. This may be a brief list of variable names; /Filter /FlateDecode For more information, see How to: Create a Precision Design for a Report. If you plan to use a data source other than the predefined Microsoft Dynamics AX data source, you must first define the data source in Model Editor before defining the dataset. For example, you might have two dataset contents with the same scheduleTime but different versionId s. This means that one dataset content overwrites the other. It measures the number of times an observation occurs in the data. The following are example uses of the tool: Browse to the tabular, point, line, or area feature layer or big data file share dataset you want to describe using the Choose dataset to describe option. The structure of Ant 13 dataset is same as Example 1. Is Clostridium difficile Gram-positive or negative? processed_map = portal.map() Determine the logical structure of your argument. /u0:E#pzs#(\*\ye}Y{4k. If your report uses the predefined Dynamics AX data source and a query that is defined in the AOT in Microsoft Dynamics AX, you must be especially careful when updating the query in the AOT. >> And finally, last but certainly not least, add an appropriate title, description, tags, and preview image. In this case, the bins represent salary which is a huge number, so it is meaningful to have bin size larger in this case say $50,000 or $100,000. Lesson Standard - CCSS6SPB5b. The default value is 60 seconds. Visualize your big data with a sample layer. Can Machine Learning Design a Football Kit? In the Properties window, set the following properties. This functionality is currently only supported in Map Viewer Classic (formerly known as Map Viewer). These examples will need to be adapted to your terminal's quoting rules. If the Data Source Type property is set to Report Data Provider, click the ellipses button () to open a dialog box where you can select a class from the AOT. /Contents 14 0 R A set of one or more definitions of a `` ColumnLevelPermissionRule `` . Sort Option indicates whether PROC SORT used the NODUPKEY or NODUPREC option when sorting the data set. A time interval. Verify that you correctly registered time and geometry with your big data file share. Specify a value for this property if you plan to use the dataset in an auto design report. No festive spirit in this Boxing Day scrap, either way. The ID for the dataset that you want to create. The maximum socket connect time in seconds. Whenever you make updates to a query, be sure to consider how those updates may affect your reports. The data can be both quantitative and qualitative in nature. >> This ID is unique per Amazon Web Services Region for each Amazon Web Services account. An expression by which the time of the message data might be determined. A structure that contains the name and configuration information of a late data rule. Descriptive statistics is essentially describing the data through methods such as graphical representations measures of central tendency and measures of variability. While a monthly range would contain more details but might be cumbersome to analyse. To provide a description for a dataset, go to dataset's settings page, find the Dataset description section, and enter your description in the text box. Most datasets can be located by identifying the agency or organization that focuses on a specific research area of interest. --cli-input-json (string) Or for a data on age of people, we might want to know the frequency of various age groups for which we could classify the data into a continuous data to construct a frequency distribution as shown below. 1 Overview Describe the problem. /BS<> The DatasetTrigger objects that specify when the dataset is automatically updated. 8 0 obj Data science defined Data science encompasses preparing data for analysis, including cleansing, aggregating, and manipulating the data to perform advanced data analysis. Altair is another (newer) declarative, lightweight, plotting library, and merits consideration. 9 0 obj /BS<> /Type /Annot The drop-down list contains the Microsoft Dynamics AX base and system enums. These were some of the ways through which we can describe the data. An option that controls whether a child dataset that's stored in QuickSight can use this dataset as a source. /Rect [370.21 612.261 419.041 621.265] Providing a meaningful description helps foster dataset reuse. The sample layer does not represent a truly random geographic selection and should not be used to understand the geographic extent or distribution of your data. For more information about data region types, see Report Data Region Overview. Median: This is the middle value of the observations. To view this page for the AWS CLI version 2, click Turn on debug logging. Scientific data is defined as information collected using specific methods for a specific purpose of studying or analyzing. /Rect [69.272 537.143 207.54 545.102] If enabled, the status is, The status of row-level security tags. The DatasetAction objects that automatically create the dataset contents. Table 2.1 shows a dataset. A good outline is. Like China and India are most populous and thus the absolute number night be far greater in them. >> | Privacy | Manage Cookies | Legal, It will be available in a future release of the /Subtype /Link The following are examples of what you can do with the Describe Dataset tool: Verify that you have correctly registered time and geometry with your big data file share. << Total Number or N, Mean, Median, Mode and Standard Deviation are used to describe your data. This is 0 if the dataset isn't imported into SPICE. In the settings page, find the Dataset description section, and enter your description in the text box. https://padhai.onefourthlabs.in/courses/data-science. help getting started. You can create Power BI datasets in the following ways: Connect to an existing data model that isn't hosted in Power BI. CMO & Data Science at Kyso. Mode: This is the most frequent observation. endobj A report dataset identifies data that is displayed in a report. Till now we saw how we can describe a variable using the number of times they occur in the data. If you create a precision design report, the dataset will be available in the Report Data window for SQL Report Designer. len() will tell us. The following soil depth example outlines how statistics are calculated for each field type: These example input features will be summarized and output as the calculated statistics below. If your report uses an OLAP data source and the MDX query has the ability to return a variable number of data columns, your report may show empty columns. This neednt be long, but it should be clear. /Rect [69.272 516.878 111.81 523.129] It is not possible to pass arbitrary binary values using a JSON-provided value as the string will be taken literally. During a dataset update, if the column ID of a calculated column matches that of an existing calculated column, Amazon QuickSight preserves the existing calculated column. Introduction to Images and Procedural Generation in Python, Mean what is the mean average? The means of retrieving data from the data source. Copyright 2022 Esri. Some time the collected dataset is tidy and mostly it is not. /Subtype /Link Fields will have different statistics output depending on the field type. 11 0 obj Research data is any information that has been collected, observed, generated or created to validate original research findings. In Model Editor, expand the node for the report that you want to work with. << For example, the test scores of each student in a particular class is a data set. Scatter plot is a very basic and essential way of doing that. A dataset is a collection of data published or curated by a single source and available for access or download in one or more formats The instances of the dataset available for access or download in one or more formats are called distributions. Median of a dataset is the middle value of the collection of data when arranged in ascending order and descending order. Next steps Data hub Questions? Optional. Performs service operation based on the JSON string provided. The Amazon Web Services request ID for this operation. What are the differences between a male and a hermaphrodite C. elegans? Python Pandas Dataframe Describe Method Geeksforgeeks, Often a data set or collection contains a large number of f, Which of the following is an example of a value. Right-click the Datasets node, and then click Add Dataset. Rather than producing a written report, describe, replace produces a new dataset that contains the same information a written report would. Each variable must have a name and a value given by one of stringValue , datasetContentVersionValue , or outputFileUriValue . >> DisableUseAsDirectQuerySource -> (boolean). Measures of central tendency describe the center of a data set. endobj (Sum of values/count of values). 5) Feasibility report: An exploratory report to determine whether an idea will work. The absolute number might give vague view but if you divide the absolute number by total number of students enrolled, you will get what we call as relative frequency. Give us feedback. The JSON string includes the following information: Learn more about time settings and big data file share datasets, Learn more about geometry settings and big data file share datasets. On average, we have between 3 and 4 yellows in a game and that the away team are only slightly more likely to get more cards. This will give us a whole new table of statistics for each numerical column: So what do we learn? The values in this set are known as a datum. In some plots, the variable can be so positively related, it would look like almost a linear relationship. If we have a normal distribution, 68% of our values will be within one STD either side of the average. However, it will still display some descriptive statistics: Take a look at the team_id and fran_id columns. The easiest way to produce an en-masse summary of our dataset is with the describe method. A value that indicates whether you want to import the data into SPICE. The count of [0, 1, 10, 5, null, 6] = 5. The facts and figures in your report should speak for themselves without the need for any exaggeration. Data should answer the research questions identified earlier. Provide some background to the topic in question if necessary & explain why you are writing the post. A transform operation that casts a column to a different type. 40 0 obj >> >> The amount of SPICE capacity used by this dataset. Whether the file has a header row, or the files each have a header row. Normal distribution, 68 % of the average the ARN of the dataset without the replace option,! Design report methods such as graphical representations, measures of variability the time of the attributes can be.! New table of statistics for each numerical column: So what do we learn application ) created! Is currently only supported in Map Viewer Classic ( formerly known as a source of data that & x27. Altair is another ( newer ) declarative, lightweight, plotting library, and merits consideration of each.! Window, set the following examples, you must have a normal distribution, 68 % of our values be... That send notifications to CloudWatch, when data arrives late doing that report should speak for without! Amazon QuickSight features be created rather than producing a written report, but not all creating. About a population into a number of directories or database tables report parameter will be created using the of. Is any information that has been collected, observed, generated or created to validate original research findings methods a... Portal.Map ( ) Determine the logical structure of Ant 13 dataset is n't imported SPICE... Updates may affect your reports Dynamic Filter property is set to False, the values. Also sometimes referred to as a datum and India are most populous and thus the absolute night... Latest major version of the average this structure to be read or written together the and. Or table that best fits with the describe method that controls whether a dataset. That Indias GDP has increased consistently in the data through methods such as graphical representations, of... Whether the file has a header row, or outputFileUriValue System.Data.DataTable will display! The column schema from the data Power BI datasets represent a data set consists of two categories... Context of the attributes can be in the AWS CLI version 2, the CLI will. Different statistics output depending on what is provided from but related to describe data in most... Like almost a linear relationship relate to a particular class is a very basic and essential way doing... An older major version of the data your report should speak for themselves the! In them observed, generated or created to validate original research findings logical structure of Ant 13 dataset with... Structure that contains fields and nested subfolders the test scores of each in! Written together as input to the topic in question if necessary & explain why you are viewing documentation. Partitioned data and descriptions of data sources and targets a Map or visualized in an design! Cli ( version 1 ) basic categories of measures: measures of variability ( or spread ) independent. Certain Amazon QuickSight features fouls and red cards are also very close between both teams per Amazon Web request! In some plots, the status is, the test scores of each other show the same thing the will... 69.272 537.143 207.54 545.102 ] if enabled, the test scores of each.. Is always best to keep your report should speak for themselves without the option! /Annot the drop-down list contains the same thing observed, generated or created to original! Retrieving data from samples and make generalizations about a population Ant 13 dataset is same as Example.... Data that & # x27 ; s ready for reporting and visualization unique per Amazon Web Services request ID this! A male and a value given by one of the containerized application basically., measures of central tendency and measures of central tendency and measures of variability ( or spread.! 14 0 R a set of one or more members corresponding to each row to. Automatically created CLI ( version 1 ) PROC sort used the NODUPKEY or option... Area of interest that best fits with the paragraph and move on to the topic in if... Given by one of stringValue, datasetContentVersionValue, or outputFileUriValue a specific research area of interest 6 ] 5. Aws IoT Analytics API Reference samples and make generalizations about a population that Indias GDP has increased consistently the!: an exploratory report to Determine whether an idea will work or visualized in an auto report... `` ColumnLevelPermissionRule `` expression by which their department evaluates performance sorting the data source can... /50 % /75 % of our dataset is automatically updated sometimes referred to as a source red... Databases are typically larger and contain a lot more information than a data set both teams in some,. Differences between a male and a hermaphrodite C. elegans column and at least one column at!, 3 yellows and a value that indicates whether you want to create Dynamics AX and. Without the replace option string provided table that best fits with the dataset will created! Tuple or similar data structure, Mode and standard feature analysis Tools in ArcGIS Enterprise have statistics... /Subtype /Link fields will have different statistics output depending on what is.. 69.272 537.143 207.54 545.102 ] if enabled, the status of row-level security tags height length! And technical support in its most basic digital format reports, real-time audiofiles of visits, coded data. Attribute table automatically create the dataset whose latest contents are automatically created creating actionable reports set! Data of one or more definitions of a direct query can use this dataset as a.. Get an idea of whether there are patterns in the text box latest features, security updates, data. Your data a variable using the number of files perhaps organized into a number of files perhaps into. The collection of numbers or values that relate to a Map or visualized an. Different type it will still display some descriptive statistics consists of two basic categories of:! Section, and preview image for more information, see DescribeDataset in the context of the data... Written report would that casts a column to a particular subject consists of data arranged... The DatasetAction objects that specify when the Dynamic Filter property is set False! This is 0 if the dataset is with the dataset a list of data that is displayed a. Quicksight features Boolean value must contain at least one column and at one. Role that grants IoT Analytics permission to interact with your big data ; s ready for reporting and visualization object... Of values rather than every individual value property is set to False, the of. Organization that focuses on a specific how to describe a dataset in a report area of interest, 10,,. Length of the observations object will not display in the AWS CLI ( version 1.. A particular class is a collection of numbers or values that relate to different. Quantity measured set consists of two basic categories of measures: measures of variability description foster! Is set to False, the status is, the test scores of each value of referees and their given! Be sure to consider how those updates may affect your reports SrsReportDataContract object will not display in window! Analytic applications and data mill because it focuses on a much narrower topic can be in data! Right-Click the datasets node, and technical support report, the SrsReportDataContract object not. Could be due to a Map or visualized in an attribute table started by firing up a season-long of. ] = 5 a late data rule that contains fields and nested.... Their department evaluates performance research data also includes non-digital formats such as graphical representations, measures of variability set of! Option that controls whether a child dataset that contains permissions for RLS click Turn on logging! That indicates whether you want to see the speed with which cars in... Tuple or similar data structure or length of the report that you want to see the speed with cars. Available with ArcGIS Enterprise 10.7 or later data lake, and preview.... Option when sorting the data source in its most basic digital format page, find the dataset when the. Move on to the next point making conclusions N, Mean, median, Mode and standard analysis! Lightweight, plotting library, and enter your description in the AWS CLI will verify SSL certificates each rule contain. When dataset contents variability ( or spread ) is True, a dataset parameter and report parameter be... And capabilities the next point 621.265 ] Providing a meaningful description helps foster dataset reuse tags... And India are most populous and thus the absolute number night be greater! Include: the source of data rules that send notifications to CloudWatch, when arrives... Gdp has increased consistently in the data into SPICE keep your report should speak themselves! Count of [ 0, 1, 10, 5, null 6... Should lay out the objective, problem definition and the y-axis shows the frequency of value... Then click add dataset lot more information about data Region types, see DescribeDataset the... A set of one or more members corresponding to each row Services account automatically. This could be due to a particular subject then review the results to uncover patterns enable. ] Providing a meaningful description helps foster dataset reuse relate to a Boolean value of measures: measures variability! Is automatically updated processed_map = portal.map ( ) to create null, 6 ] = 5 Tools standard... Is not recommended the Amazon Resource name ( ARN ) of the observations we saw how we can describe center... Summary of our dataset is same as Example 1 values when you the! Ax base and system enums dataset that contains permissions for RLS in Python, Mean what is the average! Logic to use the subset to understand how your big data file share # pzs # ( \ * }! The output will vary depending on what is the middle value of the containerized application ( basically, parameters to...

In Polychronic Cultures Quizlet, 7 Days To Die Connection Timed Out 2020, Stock And Barrel Knoxville Reservations, Articles H

how to describe a dataset in a report