.. image:: /_images/icons/app_simlytiks.png
   :width: 5%
   :align: right
   :target: index.html

.. |DataProfiler| image:: /_images/icons/icon_DataProfiler.png
   :width: 4%
   :target: DataProfiler.html

.. _DataProfiler:

####################
|Data Profiler|
####################

Data Profiler is a new Simlytiks feature that includes Data Summary Schema, Data Distribution and Frequency, and Relationships.

.. video:: _static/movies/dataprofiler.mp4
   :width: 100%

The Image Gallery in Data Profiler now supports a mosaic view.

.. video:: _static/movies/dataprofilermosac.mp4
   :width: 100%

Clicking a value in the Data Profiler Heatmap shows a scatterplot.

.. video:: _static/movies/dataprofilerheatmap.mp4
   :width: 100%

Clicking Heatmap cells in Data Profiler shows a cross plot matrix, with an option to switch between scatterplot and QQ plot.

.. video:: _static/movies/dataprofilerheatmapcrossqq.mp4
   :width: 100%

Cross plots in Data Profiler are always colored using 3 colors and ignore numerical binning.

.. thumbnail:: /_images/Images/colorcrossplotdata.png
   :title: Coloring cross plot

.. centered:: :sup:`Coloring cross plot`

When there are categorical variables in the dataset, there is an option to view the relationships among them. For numerical variables, Pearson's correlation coefficients are shown. For categorical variables, a chi-square test is conducted and the p-value from the test is returned for reference. Hovering the cursor over a cell displays a tooltip with more information on the test. When the p-value is smaller than the significance level (a commonly used value is 0.05), there is enough evidence to conclude that the two variables studied are not independent; that is, they are associated.

.. thumbnail:: /_images/Images/relationship_cate_pvalue_table.png
   :title: Relationship table for categorical variables

.. centered:: :sup:`Relationship table for categorical variables`

A QQ Plot Matrix is created for all QQ plots in Data Profiler Heatmaps.

.. video:: _static/movies/qqplotmatrixdata.mp4
   :width: 100%

Data Profiler uses a larger font size for column names in the schema table and data table.

.. thumbnail:: /_images/Images/largefontsize.png
   :title: Font size for Data Profiler

.. centered:: :sup:`Font size for Data Profiler`

Data Profiler now has a new representation for Inputs/Objectives and Targets.

.. video:: _static/movies/targets.mp4
   :width: 100%

Thresholds from targets are now grayed out based on an option (within Targets) and shown in the cross plots for Data Profiler.

.. thumbnail:: /_images/Images/targetsthresh.png
   :title: Cross plots for Data Profiler

.. centered:: :sup:`Cross plots for Data Profiler`

In Data Profiler, right-click the Heatmap to enlarge the QQ Plot Matrix.

.. video:: _static/movies/enlargeqqplot.mp4
   :width: 100%

Data Profiler uses a summary visualization in the Data Summary and Structure column.

.. thumbnail:: /_images/Images/datasummary.png
   :title: Data Summary and Structure column

.. centered:: :sup:`Data Summary and Structure column`

Workers can be searched using Machine Learning in Data Profiler.

.. thumbnail:: /_images/Images/dataml.png
   :title: Workers searched

.. centered:: :sup:`Workers searched`

In Data Profiler, larger ML datasets render easily so that all charts can be viewed.

.. thumbnail:: /_images/Images/datamlloading.png
   :title: Data Profiler for Large ML Datasets

.. centered:: :sup:`Data Profiler for Large ML Datasets`

In Data Profiler, ML datasets with a larger number of columns have pagination within the Heatmap and Scatter Matrix Plot.

.. video:: _static/movies/mllarger.mp4
   :width: 100%

In Data Profiler, visualizations in Data Distribution can be enlarged.

.. thumbnail:: /_images/Images/enlargeviz1.png
   :title: Enlarge visualization

.. centered:: :sup:`Enlarge visualization`

In Data Profiler, the chart type of visualizations in Data Distribution can be changed to Histogram or Violin chart.

.. thumbnail:: /_images/Images/Charttype1.png
   :title: Chart Type for Visualization

.. centered:: :sup:`Chart Type for Visualization`

.. thumbnail:: /_images/Images/Charttype2.png
   :title: Chart Type for Visualization

.. centered:: :sup:`Chart Type for Visualization`

A search filter is included for the Data Distribution and Frequency column.

.. thumbnail:: /_images/Images/searchfilter.png
   :title: Search filter

.. centered:: :sup:`Search filter`

The Heatmap header has several settings, such as view type, pagination, page size, and filter by coefficient.

.. thumbnail:: /_images/Images/heatmapsettings1.png
   :title: Heatmap Settings

.. centered:: :sup:`Heatmap Settings`

.. thumbnail:: /_images/Images/heatmapsettings2.png
   :title: Heatmap Settings

.. centered:: :sup:`Heatmap Settings`

.. thumbnail:: /_images/Images/heatmapsettings3.png
   :title: Heatmap Settings

.. centered:: :sup:`Heatmap Settings`

Groups can be added in Data Profiler using the Groups option in the header.

.. thumbnail:: /_images/Images/groups12.png
   :title: Groups

.. centered:: :sup:`Groups`

Filters can be added in Data Profiler using the Filters option in the header.

.. thumbnail:: /_images/Images/filters12.png
   :title: Filters

.. centered:: :sup:`Filters`

In Data Profiler, data groups can be created and colored using the Color by groups option.

.. thumbnail:: /_images/Images/colorbygroups.png
   :title: Color by groups

.. centered:: :sup:`Color by groups`

Groups in Data Profiler can be applied as Filters using an option at the top (Filter by group), which adds a Data Filter in Simlytiks.

.. thumbnail:: /_images/Images/colorbygroupsandfilters.png
   :title: Color by groups and Filters

.. centered:: :sup:`Color by groups and Filters`

Sections in Data Profiler can be added to the Basket and viewed as Basket items.

.. thumbnail:: /_images/Images/basket.png
   :title: Add to basket

.. centered:: :sup:`Add to basket`

Sections in Data Profiler can be exported as CSV and ZIP files.

.. thumbnail:: /_images/Images/exportsections.png
   :title: Export Sections

.. centered:: :sup:`Export Sections`

A View 3D option is available for visualizations in Data Profiler.

.. thumbnail:: /_images/Images/3doptions.png
   :title: 3D Option

.. centered:: :sup:`3D Option`

In Data Profiler, a visualization can be cloned and pasted onto a new page in the Simlytiks dataset.

.. thumbnail:: /_images/Images/cloneandpaste.png
   :title: Clone and Paste visualization

.. centered:: :sup:`Clone and Paste visualization`

.. thumbnail:: /_images/Images/cloneandpaste1.png
   :title: Clone and Paste visualization

.. centered:: :sup:`Clone and Paste visualization`

Basket items are now sortable in Data Profiler.

.. thumbnail:: /_images/Images/basketitemsosrt.png
   :title: Sort Basket items

.. centered:: :sup:`Sort Basket items`

For sections in Data Profiler, we can zoom in to a visualization and highlight part of it.

.. thumbnail:: /_images/Images/zoomandhighlight.png
   :title: Zoom and Highlight

.. centered:: :sup:`Zoom and Highlight`

.. thumbnail:: /_images/Images/zoomandhighlight1.png
   :title: Zoom and Highlight

.. centered:: :sup:`Zoom and Highlight`

In Data Profiler, we can switch between the Relationship matrix and the Chord diagram.

.. thumbnail:: /_images/Images/chorddata.png
   :title: Relationship matrix and Chord diagram

.. centered:: :sup:`Relationship matrix and Chord diagram`

The Data Profiler Relationships section now shows a Categorical Relationship Matrix separately for categorical values.

.. video:: _static/movies/datacategorical.mp4
   :width: 100%

|

The Heatmap header now has several settings, including View Type, Pagination Position, Page Size, and Filter by Co-Efficient.

.. video:: _static/movies/heatmapsettings1.mp4
   :width: 100%

|

In Data Profiler, colors can be assigned for targets: go to KPI → Targets, add a new target, then gray out the non-existing records and color the ones that are within the target range.

.. video:: _static/movies/datakpitargets.mp4
   :width: 100%

|

In Data Profiler, a new visualization, Column Summary, is added to summarize a numerical value.

.. video:: _static/movies/colunsummary.mp4
   :width: 100%

|

In Data Profiler, View Lucy ML now shows the actual Lucy response page with primary and non-primary responses.

.. video:: _static/movies/dataklucyml.mp4
   :width: 100%

|

Added kriging_interpolation with multiple inputs and targets, and added simulated_annealing output. Scroll down to see the visualizations created.

.. video:: _static/movies/kriginginterpolation.mp4
   :width: 100%

|

In Data Profiler, visualizations added from the KPI tab along with the provided inputs/objectives/targets can be easily removed.

.. video:: _static/movies/vizaddedkpi.mp4
   :width: 100%

|

When any Input, Objective, or Target is changed, the charts re-render with the new options so that they respond to the changes in the Data Triggers view.

.. video:: _static/movies/targetinputobjective.mp4
   :width: 100%

|

In Data Profiler, Data Distribution and Frequency has inline search filters to view responses.

.. video:: _static/movies/datadistribution.mp4
   :width: 100%

|

In Data Profiler, a new Groups section is added that stays in sync with Tools.

.. video:: _static/movies/groupsnew.mp4
   :width: 100%

|

In Data Profiler, a new Filter section is added to filter the charts, and the headers are fixed so they stay in view as the page scrolls.

.. video:: _static/movies/newsectionfilter.mp4
   :width: 100%

.. video:: _static/movies/headersfixed.mp4
   :width: 100%

|

The Relationships section in Data Profiler now has a search filter and a dropdown list of response names, which can be used to filter grouped box plots or stacked plots.

.. video:: _static/movies/RELATIONSHIPSFILTERS.mp4
   :width: 100%

|

Performance is improved for Data Profiler and data table rendering with large datasets.

.. video:: _static/movies/dataprofilerimprovement.mp4
   :width: 100%

|

In Data Profiler, the KPI tabs are updated with a new, more user-friendly UI that has Execute with a spinner and a log container, with default options in settings.

.. video:: _static/movies/dataprofilernewUI.mp4
   :width: 100%

|
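
The relationship statistics described earlier on this page (Pearson's correlation coefficient for numerical variables and a chi-square p-value for categorical variables) can be sketched in plain Python. This is only an illustration under stated assumptions, not Simlytiks code: the function names ``pearson_r``, ``chi2_sf``, and ``chi_square_independence`` and the sample data are hypothetical, and the chi-square statistic is computed without a continuity correction, which may differ from what Data Profiler does internally.

```python
import math
from collections import Counter


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two numerical columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def chi2_sf(stat, df):
    """Upper-tail probability P(X >= stat) of a chi-square distribution.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    starting from Q(1, y) = exp(-y) or Q(1/2, y) = erfc(sqrt(y)).
    """
    y = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi_square_independence(col_a, col_b):
    """Chi-square test of independence for two categorical columns.

    Returns (statistic, degrees of freedom, p-value).
    No continuity correction is applied (an assumption of this sketch).
    """
    n = len(col_a)
    observed = Counter(zip(col_a, col_b))
    row_totals, col_totals = Counter(col_a), Counter(col_b)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical sample data: two categorical columns with 60 records.
material = ["A"] * 30 + ["B"] * 30
result = ["pass"] * 10 + ["fail"] * 20 + ["pass"] * 20 + ["fail"] * 10

stat, df, p_value = chi_square_independence(material, result)
# A p-value below the chosen significance level (for example 0.05)
# suggests the two columns are not independent.
print(f"chi2={stat:.3f}, df={df}, p={p_value:.4f}, dependent={p_value < 0.05}")
```

The p-value printed at the end corresponds to the value shown in the categorical relationship table; comparing it with the significance level is the decision rule described in the text.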