{{'2016-02-15T13:00:55.5284474+00:00' | utcToLocalDate }}

PowerQuery – The power of M

Simon Sabin

I love PowerBI, actually I love PowerQuery. It's a great way to combine data from around your business, across the web from anywhere. What I really like is very little is done automatically, i.e. it doesn't do the nice data type detection you get with Excel that screws your imports if the data type in a column differs from the first few rows to the rest of the file.

Does that make it difficult to use. No its not. The nice thing is that its very easy to add additional columns, change data types, use expressions, combine datasets, and do much much more.

If you've ever used SSIS you will often find that the UI gets in the way of the experience. SSIS is really focused on a high performance pipeline. This bubbles far too much to the surface with the buffers and column mappings.

Lets walk through a simple example, of splitting a column that contains filename into the file and the extension

In SSIS

  1. Add a derived column component to your data flow
  2. Delete the existing connection between two data flow components
  3. Connect the upstream component to the derived column component
  4. Connect the downstream component to the derived column component
  5. Correct any mapping you had defined on the downstream component
  6. Open the derived column component
  7. Add a row for the new column
  8. Try and enter your expression in the single text box (or have downloaded BidsHelper to give you the expression editor ) to find the filename
  9. Make sure the length for the new field is going to be big enough
  10. Do the same to get the file extension
  11. Get some sample data
  12. Put in a data viewer
  13. Run the data flow to check the split is working correctly
  14. Click ok

How about in Power Query The wizard way (picture above)

  1. Click on the column you want to split
  2. Click the split column in the toolbar
  3. Specify the delimiter and how you want it split
  4. Click ok
  5. View the split is doing what you want

The wizard way (2)

  1. Click Add Custom Column
  2. Put as the function (Column1 is the source column and Source is the source table, Column1.1 and Column1.2 are the new column names"

Table.SplitColumn(Source,"Column1",Splitter.SplitTextByEachDelimiter({"."}, null, true),{"Column1.1", "Column1.2"})

  1. Click Ok

The non wizard way

  1. Open the Advanced editor for the query
  2. Add a new line where you want to add the column in the steps
  3. Add

SourceWithSplitColumn = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByEachDelimiter({"."}, null, true),{"Column1.1", "Column1.2"})

  1. Change the line that was referencing the old variable to point to the new variable
  2. Click OK and view the changes

The real power query is that everything is just code, you don't have to deal with some bad UIs, the fall back is always a line of code. With SSIS if something is awful in the UI you had to resort to the advanced editor and that only worked if the component hooked everything up properly.

I was amazed at the library of functions available, json parsing, xml parsing, download from web, from sql and much much more. Discussing this with Matt Masson at ignite this month it all made sense ALL components are built using the M Language. If they need to do something that they can't do with the language they consider extending the language. Adding more functions, which means its then available to all.

Not seen Power Query, where do you start.

Well you can just jump create some queries, use the wizards and see whats generated, quickly you'll want to be jumping into the advanced editor (well I was) and with no intellisense you will stumble.

So once you've found your way around maybe watching some videos looking at samples.

https://www.sqlbits.com/Content/?type=3&tags=Power+Query

You really need to understand the basics of the language. I'd read the first few sections of the language specification (first item below) this explains syntax, and is really useful. Then there are all the functions you will want they are in the next 2.

  • Microsoft Power Query for Excel Formula Language Specification (PDF) – Comprehensive explanation of the Power Query Formula Language (informally known as "M"). Power Query embeds M documents in Excel workbooks to enable repeatable mashup of data.
  • Power Query formula categories – Concise Power Query formula reference with syntax, arguments, remarks, and examples for all formulas.
  • Power Query Formula Library Specification (PDF) – Concise Power Query formula reference with syntax, arguments, remarks, and examples for all formulas.

Happy reading

comments powered by Disqus