Tuesday, September 18, 2018

Power BI - PDF file connector.

Data connectivity - PDF file connector

This month Power BI released a preview of the PDF Connector, which has been a huge ask from the community. To try it out, enable it under the Preview features list in the Options dialog.


After enabling this Preview feature and restarting Power BI Desktop, the PDF File connector will appear under the File category in the Get Data dialog.




After selecting this connector, you will be prompted to specify a path to a PDF file. Once the file is specified, Power Query will extract tables automatically and present them to you in the Navigator dialog, where you can preview and select one or multiple tables.




For Reference: PDF data would appear in Power BI as below:





















Tuesday, September 11, 2018

Data Lake vs Data Warehouse by GT


What is a data lake?
Some mistakenly believe that a data lake is just the 2.0 version of a data warehouse. While they are similar, they are different tools that should be used for different purposes. James Dixon, the CTO of Pentaho, is credited with naming the concept of a data lake. He uses the following analogy:
“If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”
A data lake holds data in an unstructured way, with no hierarchy or organization among the individual pieces of data. It holds data in its rawest form, neither processed nor analyzed. Additionally, a data lake accepts and retains all data from all data sources and supports all data types; schemas (the way the data is stored in a database) are applied only when the data is ready to be used.
What is a data warehouse?
A data warehouse stores data in an organized manner with everything archived and ordered in a defined way. When a data warehouse is developed, a significant amount of effort occurs during the initial stages to analyze data sources and understand business processes. Decisions are made regarding what data to include and exclude from the warehouse. Data is only loaded into the warehouse when a use for the data has been identified.
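The contrast between the two definitions above is sometimes described as schema-on-read (lake) versus schema-on-write (warehouse). The following is only a minimal sketch of that idea in Python: the sample records, the `WAREHOUSE_SCHEMA` columns and both function names are made up for illustration and do not correspond to any particular product.

```python
import json

# Hypothetical raw events, as a lake might hold them: untouched, mixed shapes.
RAW_LAKE = [
    '{"order_id": 1, "amount": "19.99", "region": "West"}',
    '{"order_id": 2, "amount": "5.00", "region": "East", "coupon": "FALL18"}',
    '{"sensor": "t-7", "reading": 21.4}',
]

# Warehouse-style fixed schema, decided before any data is loaded (assumed columns).
WAREHOUSE_SCHEMA = {"order_id": int, "amount": float, "region": str}


def load_into_warehouse(raw_lines):
    """Schema-on-write: convert on load and skip records that do not fit."""
    table = []
    for line in raw_lines:
        record = json.loads(line)
        if not all(column in record for column in WAREHOUSE_SCHEMA):
            continue  # no identified use for this record, so it is never stored
        table.append({col: cast(record[col]) for col, cast in WAREHOUSE_SCHEMA.items()})
    return table


def read_from_lake(raw_lines, schema):
    """Schema-on-read: everything stays stored; a schema is applied per query."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if all(column in record for column in schema):
            rows.append({col: cast(record[col]) for col, cast in schema.items()})
    return rows


warehouse = load_into_warehouse(RAW_LAKE)
sensor_view = read_from_lake(RAW_LAKE, {"sensor": str, "reading": float})
print(len(warehouse), len(sensor_view))
```

In this sketch the warehouse drops the sensor record and the `coupon` field at load time, while the lake keeps all three raw records and can still answer a question about sensors that nobody planned for.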
How do data lakes and data warehouses compare?
Data
Data lakes retain all data: structured, semi-structured and unstructured/raw. It’s possible that some of the data in a data lake will never be used. A data warehouse only includes data that is processed (structured), and only the data that is needed for reporting or to answer specific business questions.
Agility
Since a data lake lacks structure, it's relatively easy to make changes to models and queries. Data lakes are more flexible and can be configured and reconfigured as necessary based on the job you need them to do. It's much more cumbersome and time-consuming to change the structure of a data warehouse due to the number of business processes tied to it.
Users
Data scientists are typically the ones who access the data in data lakes because they have the skill-set to do deep analysis. Technically, data lakes can support all users and are available to all. Data warehouses are used by specific business users to report and extract a particular meaning from the data that was defined when the data warehouse was set up; they are usually too restrictive for data scientists who need to go beyond the boundaries of the warehouse to glean new analysis from the data.
Security
Since data warehouses are more mature than data lakes, the security for data warehouses is also more mature. There is also concern that storing all data in one repository makes the data in a data lake more vulnerable. On the other hand, having just one store to manage certainly makes auditing and compliance easier.
Data lakes and data warehouses are different tools for different purposes. If you already have an established data warehouse, you might choose to implement a data lake alongside it to solve for some of the constraints you experience with a data warehouse. To determine whether a data lake or data warehouse is best for your needs, you should start with the goal you are trying to achieve and use the data repository that will help you meet your goal.

Tuesday, September 4, 2018

Visualization interactions in a Power BI report.

Visualization interactions in a Power BI report

Get PBIX here.

If you have edit permissions for a report, you can use Visual interactions to change how visualizations on a report page impact each other.
By default, visualizations on a report page can be used to cross-filter and cross-highlight the other visualizations on the page. For example, selecting a state on a map visualization highlights the column chart and filters the line chart to display only data that applies to that one state. And if you have a visualization that supports drilling, by default, drilling one visualization has no impact on the other visualizations on the report page. But both of these default behaviors can be overridden, and interactions set, on a per-visualization basis.
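To make the difference between the two behaviors concrete, here is a small Python sketch. It is not how Power BI is implemented; the `SALES` rows and the function names are invented for illustration. Cross-filtering leaves the target visual with only the matching data, while cross-highlighting keeps the full totals and marks the selected share of each.

```python
# Hypothetical sales rows shared by every visual on the page.
SALES = [
    {"state": "WA", "month": "Jan", "sales": 100},
    {"state": "WA", "month": "Feb", "sales": 120},
    {"state": "OR", "month": "Jan", "sales": 80},
    {"state": "OR", "month": "Feb", "sales": 90},
]


def totals_by_month(rows):
    """Aggregate sales per month, as a chart's data points."""
    totals = {}
    for row in rows:
        totals[row["month"]] = totals.get(row["month"], 0) + row["sales"]
    return totals


def cross_filter(rows, state):
    """Target visual shows only the rows that match the selection."""
    return totals_by_month([r for r in rows if r["state"] == state])


def cross_highlight(rows, state):
    """Target visual keeps every total and marks the selected portion of each."""
    selected = cross_filter(rows, state)
    return {
        month: {"total": total, "highlighted": selected.get(month, 0)}
        for month, total in totals_by_month(rows).items()
    }


line_chart = cross_filter(SALES, "WA")        # filtered: only WA data remains
column_chart = cross_highlight(SALES, "WA")   # highlighted: totals kept, WA share marked
print(line_chart)
print(column_chart)
```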
This article shows you how to use Visual interactions in Power BI service Editing view and in Power BI Desktop. If a report has been shared with you, you will not be able to change the Visual interactions settings.
Note
The terms cross-filter and cross-highlight are used to distinguish the behavior described here from what happens when you use the Filters pane to filter and highlight visualizations.

  1. Select a visualization to make it active.
  2. Display the Visual Interactions options.
    • In Power BI service, select the dropdown from the report menubar.
    • In Desktop, select Format > Interactions.
  3. To turn on the visualization interaction controls, select Edit interactions. Power BI adds cross-filter and cross-highlight icons to all of the other visualizations on the report page.

  4. Determine what impact the selected visualization should have on the others. And, optionally, repeat for all other visualizations on the report page.
    • If it should cross-filter the visualization, select the filter icon.
    • If it should cross-highlight the visualization, select the highlight icon.
    • If it should have no impact, select the no impact icon.
  5. To turn on drilling controls, select Drilling filters other visuals. Now when you drill down (and up) in a visualization, the other visualizations on the report page change to reflect your current drilling selection.
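One way to picture what the steps above configure is a table of per-pair overrides: each setting belongs to a (selected visualization, other visualization) pair, and anything not overridden keeps the default. The Python sketch below is a hypothetical model only; the class, its method names and the choice of highlight as the default are assumptions for illustration, not Power BI's API.

```python
from enum import Enum


class Interaction(Enum):
    FILTER = "filter"
    HIGHLIGHT = "highlight"
    NONE = "none"


class PageInteractions:
    """Hypothetical model of per-visual interaction overrides on one report page."""

    def __init__(self, default=Interaction.HIGHLIGHT):
        self.default = default                 # assumed default for this sketch
        self.overrides = {}                    # (source, target) -> Interaction
        self.drilling_filters_others = False   # per the text, drilling has no impact by default

    def set_interaction(self, source, target, interaction):
        """Record how selecting `source` should affect `target`."""
        self.overrides[(source, target)] = interaction

    def get_interaction(self, source, target):
        """Return the override for this pair, or the page default."""
        if source == target:
            return Interaction.NONE  # a visual does not act on itself
        return self.overrides.get((source, target), self.default)


page = PageInteractions()
page.set_interaction("map", "line chart", Interaction.FILTER)
page.set_interaction("map", "card", Interaction.NONE)
page.drilling_filters_others = True  # step 5: drilling now affects other visuals

print(page.get_interaction("map", "line chart"))
print(page.get_interaction("line chart", "map"))
```

Note that the settings are directional in this model: overriding how the map affects the line chart says nothing about how the line chart affects the map, which is why step 4 suggests repeating the process for the other visualizations.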
