Friday, July 3, 2020

Simple Linear Regression in Power BI

The correlation coefficient is a statistical measure of the strength of the relationship between the relative movements of two variables. The values range between -1.0 and 1.0. A calculated number greater than 1.0 or less than -1.0 means that there was an error in the correlation measurement. A correlation of -1.0 shows a perfect negative correlation, while a correlation of 1.0 shows a perfect positive correlation. A correlation of 0.0 shows no linear relationship between the movement of the two variables.

The coefficient of determination, denoted R² or r² and pronounced "R squared", is the proportion of the variance in the dependent variable that is predictable from the independent variable(s).

We shall see how the data is related if we create a scatter plot in power bi

Scatter Plot in Power BI

1. Click the Scatter Plot visualization and add your columns. For e.g. Year Month, Count Days, and Amount to determine the relationship.

2. From the Analytics pane add a Trend Line



There definitely seems to be a correlation between days and amount, so now we will make the calculations to see if we are right.

      Creating the Coefficient of Correlation

  1. Right click on the table and click New quick measure
  2. Select Correlation coefficient from the Calculations under “Mathematical operations”.
  3. Select the Category, Measure X, and Measure Y. These columns will match the dot plot we created earlier.
  4. The Coefficient of Correlation will now be available in your table, and it’s ready for use.

    coefficient of correlation

    Creating the Coefficient of Determination

    In this case, a quick measure would be overkill. The Coefficient of Correlation is notated as the letter R. The Coefficient of Determination is R2.

    Coefficient of Determination = [Coefficient of Correlation]2

    We now have two statistics based on the data set that tells how and to what degree the variables are related.

    coefficient of determination

    As we can see, per our definitions above, both the Coefficient of Correlation and Determination are very close to 1. This means Days is certainly related to Sum of Amount and does a very good job of predicting how much we will spend given the number of days we do something.0

    Simple Linear Regression analysis is quite useful and prevalent across many business cases. Combining it with Power BI can create powerful analytical capabilities.

    Thanks :)

Sunday, June 28, 2020

Dynamic grouping of values in Power BI [Banding]

Sometimes there are needs when you would want to Power BI to give user the ability to choose the bands for a particular dimension and update the charts dynamically. In this example we are going to discuss one such example.

Business Case: User wants to see the count of customers within different age groups where he wants to change the age groupings dynamically and does not want fixed groups.

Lets Start:
We can start by looking into how we can produce a table where age groupings are defined:

Age Band = GENERATESERIES(0,120,1)

This will generate a table with all possible age values between 0 & 120.
We can simply connect it with our dimension table and have age from Age Group table as the X-axis to filter the count of customers in the dimension table.

Dynamic: Now this is static, we will dive deeper and make it dynamic.

The measure/trick is:

Count of Customers in each group - by size =
var _minAge=[Min Age]
var _maxAge=[Max Age]
var _bandSize=[What is Band Size Value]
var _AgeBandTable=GENERATESERIES(_minAge,_maxAge,_bandSize)
var _currAge=SELECTEDVALUE('Age Band'[Age])
var _bandHead=MAXX(
    FILTER(
    _AgeBandTable,
    [Value]<=_currAge),[Value])
var _bandTail=_bandHead+_bandSize
return
if(_currAge=_bandHead,
CALCULATE(
    COUNTROWS(DimCustomer),
'Age Band'[Age]>=_bandHead && 'Age Band'[Age]<_bandTail
))


Finally you can use the measure to dynamically update your visual.

Regards

Saturday, June 27, 2020

Dynamic Title in Power BI

Making your reports/dashboards more dynamic requires you to be attentive to small details. In my last post I discussed about how you can create a dynamic table with a slicer having multiple dimensions. In this post I want to talk about how to improve your dynamic tables and charts in reports. 

If you want to link the title of your chart on report page to a filter/slicer on the page, you can do so in many different ways. Lets discuss a few of them now
Here you have to make a title measure with one of the following options:

Option 1: If you want to see Sales by cities - connect you City filter as follows:
Title = "sales for the cities "&CONCATENATEX(data,data[City],",") Option 2: You can use SelectedValue function: Title = SELECTEDVALUE("Total Sales and GM by:", 'Dimension[dimension])

Option 3: You can use Concatenatex to make allow reading multiple selections (Thanks to 

Title =
"Sales Amount for "     & CONCATENATEX (                             VALUES( 'Dimension'[Dimension] ),                          ''Dimension [Dimension],                            ", "         )

Option 4: You can use HasOneValue. The measure uses the “HASONEVALUE” function to see if only one date is filtered in the DateTable. If only one date is selected it returns the dynamic title otherwise it returns a message that basically tells the user to select a single date.

    Title = 
    If (
        HASONEVALUE ( DATETable[Datekey]),
        "Sales as on : "& [Datekey], 
        "Please select one value"
         )


Thus, you will be able to have a dynamic filter for your charts/visualization.
After selecting an option you will have a measure which you can simply drag into a card visual or better multi-card visual. Multi-Card visual is left aligned by default, so you would not have any problem when you change the value in filter.

Thank you :)

Saturday, June 20, 2020

How to change the tables rows dynamically in Power BI !!!

Making Power BI more powerful using dynamic measures: In this post I will show you how you can use a slicer and filter the tables on multiple dimensions:

Business Problem: Suppose you have multiple dimensions e.g. Country, Region, Supplier, Customer etc. and you want to see the sales by year by these dimensions

This week got to be an exciting one for me where I had to perform some interesting operations in power BI. It's always better to create dynamic reports in whatever platform you use to be able to reuse the features again next time. In Power BI you would have to write some DAX measures for this. For such use this time I got to learn and implement some dynamic features in power bi which i will show you now.


You can install and configure the Adventure Works SQL database and practice or you can try to implement it on your own data.

Solution 1: Using the Hierarchy feature

We can leverage the hierarchy feature on visuals. In this case, we just need to put our 5 columns on the rows, and rows will change when one clicks on “Go to the next level in the hierarchy”.



But the problem is that in Power BI Services, users won’t be able to see the hierarchy, and if they want to split the table by country, they’ll have to click 4 times to the same button.


Solution 2: Un-Pivoting a table

To achieve this, make a copy of DimSalesTerritory table and rename to DimSalesTerritoryUnpivoted select the 3 columns in DimSalesTerritory table and on Unpivot columns. Then keep the 3 columns, which ends up by having the following table:

The table will look like this

                                                                


Link the table to the data model:


Change the cross-filter direction from single to both, because we’ll use the Attribute column from the new table to filter the DimSalesTerritory table. The attribute column is our slicer, and in our visual, the value column from the new table is our dynamic rows.

In the end, users have the flexibility to select a column in the slicer, and the matrix table is updated based on the user selection.



DAX table and measure:

The first step is to create a table with all columns name in a column, and all corresponding values in another column. We will leverage this later.

To make it dynamic, we would create a DAX table to achieve the same thing.

                                                    


                                                       


Now we need to create a measure on the FactResellerSales table to leverage this DAX table. The trick is to use the TREATAS DAX function, which “Treats the columns of the input table as columns from other tables. For each column, filters out any values that are not present in its respective output column.”




On the matrix table, I put the slicer table values in rows, and the SumSalesAmount (Dynamic Slicer) in values. The Slicer table type should be placed as a slicer. In the end, the result is fast, user friendly, and doesn’t change the data model drastically.


While implementing it on your own data, you just need to remember to update the measure that use this table whenever you add or remove a column in the table. To use this solution with many reports, the trick is to add a visual level filter to select columns you want your users to be able to change the visual with.

Hope this is helpful!

 


Sunday, June 14, 2020

Linked List Implementation of a Stack


Since by now you know Stack can be implemented in various ways -
Linked List implementation or Array implementation...

Its all about achieving Stack behavior and your constraint adjustment as per your needs, you can go either way (use Arrays or Linked List)-  here is one linked list implementation (Remember- you need to create your own linked list and do not use Java utility classes for such implementation), here is the code to help you understand:

************************************************
package Stack;

import java.util.Scanner;


// Stack Class
public class LinkedListImplementation {
public int size;
private Node node = null ;

LinkedListImplementation(){
new Node() ;
}

public boolean isEmpty() {
return node==null;
}

public int size() {
int size = 0;
Node temp = node;
while(temp !=null) {
temp = temp.next ;
size++;
}
return size;
}

public void push(String item) {
Node head = node;

node = new Node() ;
node.item = item;
node.next = head ;

}

public String pop() {
String item = node.getItem() ;
node = node.getNext() ;
return item;
}

public static void main(String args[]) {
Scanner sc = new Scanner(System.in) ;

LinkedListImplementation lstack = new LinkedListImplementation() ;
lstack.push("A");
lstack.push("B");
lstack.push("C");
lstack.push("D");

lstack.display();
System.out.println("");
System.out.println(lstack.size()) ;
lstack.pop();
lstack.display();

}

public void display() {
if(node == null)
return;
else {
Node temp = node;
while(temp!=null) {
System.out.print(temp.item+" -> ");
temp= temp.next;
}
}

}

}

****************************************************

/* Node Class  */

package Stack;

public class Node {

String item;
Node next ;

public Node(){
this.next = null;
this.item = null;
}

public String getItem() {
return item;
}
public void setItem(String item) {
this.item = item;
}

public Node getNext() {
return next;
}
public void setNext(Node next) {
this.next = next;
}


}

*************************************
This is very simple and easy implementation of Stack which provides all basic push, pop features..more features can be added as per needs.  


Saturday, June 13, 2020

Date Table in Power BI – Time Intelligence using DAX

In this post I want to walk you through how to Invoke Custom Function button in the Query Editor in Power BI and explain why it’s such a useful thing to have. One always needs time intelligence to have a full grasp of the data insights.
Use the below M query to create a date table:
//Create Date Dimension
(StartDate as date, EndDate as date)=>
let
    //Capture the date range from the parameters
    StartDate = #date(Date.Year(StartDate), Date.Month(StartDate),
    Date.Day(StartDate)),
    EndDate = #date(Date.Year(EndDate), Date.Month(EndDate),
    Date.Day(EndDate)),
//Get the number of dates that will be required for the table
    GetDateCount = Duration.Days(EndDate – StartDate),
//Take the count of dates and turn it into a list of dates
    GetDateList = List.Dates(StartDate, GetDateCount,
    #duration(1,0,0,0)),
//Convert the list into a table
    DateListToTable = Table.FromList(GetDateList,
    Splitter.SplitByNothing(), {“Date”}, null, ExtraValues.Error),
//Create various date attributes from the date column
    //Add Year Column
    YearNumber = Table.AddColumn(DateListToTable, “Year”,
    each Date.Year([Date])),
//Add Quarter Column
    QuarterNumber = Table.AddColumn(YearNumber , “Quarter”,
    each “Q” & Number.ToText(Date.QuarterOfYear([Date]))),
//Add Week Number Column
    WeekNumber= Table.AddColumn(QuarterNumber , “Week Number”,
    each Date.WeekOfYear([Date])),
//Add Month Number Column
    MonthNumber = Table.AddColumn(WeekNumber, “Month Number”,
    each Date.Month([Date])),
//Add Month Name Column
    MonthName = Table.AddColumn(MonthNumber , “Month”,
    each Date.ToText([Date],”MMMM”)),
//Add Day of Week Column
    DayOfWeek = Table.AddColumn(MonthName , “Day of Week”,
    each Date.ToText([Date],”dddd”))
in
    DayOfWeek
It asks for start and end date. Once entered the date then click Invoke, it will generate the query.

Creates the below custom table 


Learning Power BI Interface: Overview


One of the most talked about software nowadays in the visualization world, Power Bi is one of the most advanced softwares available out there. Coming from the Microsoft family of products, it provides robust and scalable solutions to your organizational needs. Competes directly with Tableau, Qlik Sense etc. Microsoft has designed an interface which is quiet easy and that presents excel like features. 


Interface: Power BI has interface has excel like features with Home, Insert, Modeling and View etc. on the top
On the left you can see Report, Data and Model tabs.
Report tab is the page where you view the visualizations
Data tab is the page where you can see the data tables in the data connected
Model tab is the where you can see relationships in the data


Connections:
You can connect to multiple data sources ranging from excel, csv, json, pdf and multiple others
  

Setup: You can install free Power Bi desktop version on your computer but it would not allow customs visualizations or Power BI service features in it. 

Working on Power BI: Basic Steps
Step 1: Connect to your data 
Step 2: Transform your data ->  Adding custom columns, changing headers, data types and other transformations on the data
Step 3: On reports tab, you can start visualizing your data for exploratory analysis
Step 4: Once you have finalized the structure of your report. You can save the report as a .pbix file 

It is one of the best softwares out there which will help you enter the data world on a powerful level.
I will cover the detailed features and advanced features in other posts. 
See you next time!

System Design, Approach a problem

Feature expectations 
  1. It is extremely important part, hence get a very clear understanding of whats the requirement for the question.
Estimations
  1. Next step is usually to estimate the scale required for the system. The goal of this step is to understand the level of sharding required ( if any ) and to zero down on the design goals for the system.
  2. For example, if the total data required for the system fits on a single machine, we might not need to go into sharding and the complications that go with a distributed system design.
  3. OR if the most frequently used data fits on a single machine, in which case caching could be done on a single machine.

Design Goals                                                                                                                             
Figure out what are the most important goals for the system. It is possible that there are systems which are latency systems in which case a solution that does not account for it, might lead to bad design.

Skeleton of the design 
30-40 mins is not enough time to discuss every single component in detail. As such, a good strategy is to discuss a very high level with the interviewer and go into a deep dive of components as enquired by the interviewer.

Deep dive 





Primary and Replica Databases

Master databases receive and store data from applications. Slave databases get copies of that data from the masters. Slaves are therefore read-only from the application's point of view while masters are read-write.
Writes to a database are more "expensive" than reads. Checking for data integrity and writing updates to physical disks, for example, consume system resources. Most web applications require a much higher ratio of reads to writes. For example a person may write an article once and then it’s read thousands of times. So setting up master-slave replication in the right scenario lets an application distribute its queries efficiently. While one database is busy storing information the others can be busy serving it without impacting each other.
Most often each master and slave database are run on separate servers or virtual environments. Each is then tailored and optimized for their needs. Master database servers may be optimized for writing to permanent storage. Slave database servers may have more RAM for query caching. Tuning the environments and database settings makes each more optimized for reading or writing, improving the overall efficiency of the application.

References :