Skip to main content
Showing results for 
Search instead for 
Did you mean: 

Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more

Reply

Pivot...UnPivot??? How to fix this bad data setup

I have a table called App Hx. It looks like this:

 

HxId

ParentIdCreatedDateOldValueNewValue
0178y00006qrgdCaAB8y0000004dK6GAI4/1/2024CreatedSubmitted
0178y000068XMB2aAB8y0000004dK6GAI4/2/2024SubmittedResponse Required
0178y000068XNshaAB8y0000004dK6GAI4/10/2024Response RequiredSubmitted
0178y000063XvcdaAB8y0000004dK6GAI4/15/2024SubmittedResponse Required
0178y000063Xr6aaAB8y0000004dK6GAI4/20/2024Response RequiredSubmitted
0178y00005zAXrvaAB8y0000004dK6GAI4/21/2024SubmittedPre-Approved
0178y00005oylvVaAB8y0000004dK6GAI4/21/2024Pre-ApprovedApproved

 

Question: Is there a way to turn the data above into a table like below in power query?

2nd Question: The table has over 800K rows. I am pretty sure, but would like confirmation: is this table is too big to perform this kind of feat in power query?

 

Parent IDEventStart DateEnd Date
aAB8y0000004dK6GAISubmitted4/1/20244/2/2024
aAB8y0000004dK6GAIResponse Required4/2/20244/10/2024
aAB8y0000004dK6GAISubmitted4/10/20244/15/2024
aAB8y0000004dK6GAIResponse Required4/15/2024 4/20/2024
aAB8y0000004dK6GAISubmitted4/20/20244/21/2024
aAB8y0000004dK6GAIPre-Approved4/21/20244/21/2024
aAB8y0000004dK6GAIApproved4/21/204 
2 ACCEPTED SOLUTIONS

Use this. That line was not needed.

let Source =Table.FromRows(Excel.CurrentWorkbook(){[Name="Table2"]}[Content]), Custom1 = [a = List.Skip(Table.ToColumns(Source)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c] in Custom1

 

View solution in original post

I think what yoy are asking for is result for a group of Parent ID. Then use this

 

let Source =Table.FromRows(Excel.CurrentWorkbook(){[Name="Table2"]}[Content]), Custom1 = Table.Combine(Table.Group(Source, {"ParentId"}, {"All", each [a = List.Skip(Table.ToColumns(_)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c]})[All]) in Custom1

View solution in original post

20 REPLIES 20

Use this. Source below is a sample one only.

let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjA0t6g0AAKzwqL0FGclHaVERyeIiIGBSYq3mbujJ1DQRN9Q38jAyATIdC5KTSxJTQGygkuTcjNLQOxYHWSTLCJ8nYxwm2QEMwmhX0cpKLW4ID+vOFUhKLWwNLMIi5l+xRl4XGcAMxTTINwONY4oS07BY6gpOS41jigyS8Tje3JcalrlGFFUhsdQQ2wuDShK1XUsKCjKL8MwL78ypyyMGPNQzNBRQhgXCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [HxId = _t, ParentId = _t, CreatedDate = _t, OldValue = _t, NewValue = _t]), Custom1 = [a = List.Skip(Table.ToColumns(Source)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c] in Custom1

 

I must be doing something wrong because I get a error of Token Idenfitie expected at the 2nd let.  This is what I have: 

let Source =Table.FromRows(Excel.CurrentWorkbook(){[Name="Table2"]}[Content]), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [HxId = _t, ParentId = _t, CreatedDate = _t, OldValue = _t, NewValue = _t]), Custom1 = [a = List.Skip(Table.ToColumns(Source)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c] in Custom1

Use this. That line was not needed.

let Source =Table.FromRows(Excel.CurrentWorkbook(){[Name="Table2"]}[Content]), Custom1 = [a = List.Skip(Table.ToColumns(Source)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c] in Custom1

 

I really, really need help. Sorting the data by Id and CreatedDate did not help. I am getting very strange results. If I filter the data by a single ParentId before adding the code, then I get the expected results. But if I try to add the code to unfiltered data, something strange happens.

Sample Data:

IdParentIdCreatedDateOldValueNewValue
0178y00005LfiTpAAJaAB8y0000004TjkGAE07/12/2023IncompleteSubmitted
0178y00005cob9UAAQaAB8y0000004TjkGAE08/10/2023SubmittedPreApproved
0178y0000645cErAAIaAB8y0000004TjkGAE10/04/2023PreApprovedApproved

 

Expected Result:

Parent IDEventStart DateEnd Date
aAB8y0000004TjkGAESubmitted07/12/202308/10/2023
aAB8y0000004TjkGAEPreApproved08/10/202310/04/2023
aAB8y0000004TjkGAEApproved10/04/2023null

 

But here is what I get!

Parent IDEventStart DateEnd Date
aAB8y0000004TjkGAESubmitted07/12/202307/12/2023
aAB8y0000004TjkGAEPreApproved08/10/202308/10/2023
aAB8y0000004TjkGAEApproved10/04/202310/04/2023

 

Thank you so much for your help so far. But I really need more assistance. I have worked on this project for 2 weeks. I have to have something produced by Monday.

I think what yoy are asking for is result for a group of Parent ID. Then use this

 

let Source =Table.FromRows(Excel.CurrentWorkbook(){[Name="Table2"]}[Content]), Custom1 = Table.Combine(Table.Group(Source, {"ParentId"}, {"All", each [a = List.Skip(Table.ToColumns(_)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c]})[All]) in Custom1

I really hate that I have to report this, but it is still not giving me the proper values. Example:

IdParentIdCreatedDateOldValueNewValue
0178y00006YNsBmAALaAB8y0000000b6BGAQ11/20/2023DeniedSubmitted
0178y00006ZfHleAAFaAB8y0000000b6BGAQ11/22/2023SubmittedResponse Required
0178y00006ZrGuVAAVaAB8y0000000b6BGAQ11/22/2023Response ReqiredSubmitted
0178y00006bjYj5AAEaAB8y0000000b6BGAQ11/27/2023SubmittedApproved

Code I used (edited to filter data for all records with a specific ParentId to test results:

let Source = Access.Database(File.Contents("C:\Users\scastille539\AppData\NA_APP_TIMELINES\NA_App_Timeline.accdb"), [CreateNavigationProperties=true]), _tblNAHistory = Source{[Schema="",Item="tblNAHistory"]}[Data], #"Sorted Rows" = Table.Sort(_tblNAHistory,{{"Id", Order.Ascending}, {"CreatedDate", Order.Ascending}}), #"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"CreatedById"}), #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each [ParentId] = "aAB8y0000000b6BGAQ"), #"Sorted Rows1" = Table.Sort(#"Filtered Rows",{{"CreatedDate", Order.Ascending}, {"Id", Order.Ascending}}), Custom1 = Table.Combine(Table.Group(#"Sorted Rows1", {"ParentId"}, {"All", each [a = List.Skip(Table.ToColumns(_)), b = {a{0}} & {a{3}} & {a{1}} & {List.Skip(a{1}) & {null}}, c = Table.FromColumns(b, {"Parent ID", "Event","Start Date", "End Date"}) ][c]})[All]) in Custom1

 Result - it is completely wrong :disappointed_face:

Parent IdEventStart DateEnd Date
aAB8y0000000b6BGAQSubmitted11/20/202311/27/2023
aAB8y0000000b6BGAQApproved11/27/202311/22/2023
aAB8y0000000b6BGAQSubmitted11/22/202311/22/2023
aAB8y0000000b6BGAQResponse Required11/22/2023null

 

My co-worker has created an incredibly complicated series of queries and vba code in MS Access. I wanted to avoid using it. But since I can't seem to get this to work, I have no choice.

 

Thank for trying to help. Your patience is commendable.

Maybe now you are ready to follow my recommendations?  We've been through the same exercise before in our company and have successfully solved it.

Okay, Mr. Super User:  If I could follow your recommendation I would give it a try. But I do not have the faintest idea how to implement it. I am an advanced beginner (maybe intermediate) user of Power Query. And your replies have been well and truly way over my head.

And, from what I could make sense of, your solution involved getting the data directly from SF. I do not want to get all that data without filtering it. It is HUGE.

We are currently pulling the SF object (application history) into Excel power query, and performing what filtering we can. Then an MS Access database links to the query. From there we created a make-table query to aggregate the data for the minimum Approved date. Then we use the new table to filter the original history (by ParentId) to only history items for applications with an approved date in the history.

It is from this point we need to figure out the dates an Application was in Submitted Status (start and end dates), and calculate the business days for each event.

Why are we using Ms Access and Excel and not Power BI? Because we work for state govenment. We just got Power BI this year. None of us are skilled enough to use it or had any formal training with it. I have only dabbled with Excel's data model and have created some simple DAX expressions and power pivots.

So here we are jumping through hoops using the ancient tools we have to move giant data.

The requestor wants a table of raw data along with the summary. And it has to be in Excel format.

Columns:

- App No

- App Type

- First Submitted Date

- Final Submitted Date

- Time in Submitted Status (this is where we need to create a table of Start and End Dates to determine the number of business days between each event and then total those sums.)

 

There are more columns, but it is pointless to go on any further.

So, if you have any helpful to add, it would be appreciated. In the meantime, please do not scold me. I did not intentionally ignore you. I just did not understand wth you are asking me to do.

 

 

Sorry to hear about you having to use MSAccess but it may actually turn out to be an advantage later on.

 

First step:  You must use the CaseSafeID for the Application object.  The 15 digit ID is not sufficient

Second:  Your Created Date must include the complete timestamp, including the seconds and (if available) the milliseconds.

 

Is that something you can do? Also, what is "huge" ?  how many rows in the Application object and how many rows in the Application History object?

The Object ID is 18 digits. The only time we get 15 digit object ids are when we are connecting to a SF Report. In this instance, we are connecting directly to the Object.

The Created Date does not contain a valid time - they are all 12:00:00 AM.

I do not know the total rows in the Application object, but it is 100's of thousands.

Same for the History Object. I think even after I filtered the Field column for only Application Status, it was still over 800k.

 

See if you can get the timestamp directly from the Application Field History object. Without it you will have a very hard time sorting the events. If multiple events happened on the same day you would not be able to determine the cadence.

 

800K is considered small.  Remember that the Field history data is immutable so you don't have to repull old data over and over - you can run a process that only fetches the changes since the last time you looked.  Note:  Salesforce Object connector queries do actually fold if you limit yourself to simple filters - they will produce custom SOQL. That means you could use Incremental Refresh if you were using Power BI Service.  But even with Access you can implement something similar.

 

For the Application object itself you could do a query with the Last Modified Date to reduce the amount of data to pull, and then a script in Access to swap out the rows that have changed and add the rows for new applications.

 

You need the Application object for cases where there is no entry in the Application Field History object.

The timestamp is just not available. And, yes there are multiple entries in the history that occur on the same date. My co-worker created multiple queries in Access. We created a mother table. Then 2 temp tables with all the Start Dates (NewValues) & EndDates(OldValues)-he said he had some code that was going to index these tables. Then we run another make table query that temporarily holds the min value from each of these tables, then another query appends that to the mother table. Then queries to delete the values in the temp tables. There is a function that loops through the tables until it finishes. I haven't seen the code he developed to index the tables, so I am curious to see if it works.

Then he has more vba code to calculate the business day difference which literally takes about an hour to update a table with over 69K rows. And that is just one of the tables.

Monday is going to suck.

After examining the data, I ran into issues. Unfortunately, the data is so messed up, I have null values in the new and old value columns. Sample:

DateOldValueNewValue
6/28/23SubmittedIncomplete
5/4/2024Incomplete 
6/10/2024 Withdrawn

 

Is there a way to deal with that? 

What is the output expected from the sample you just posted? Accordingly, I can change the code.

Hello, 

I think I figured it out. Before I insert code you provided, I need to have the table sorted in in Hx Id  and CreatedDate order.

Example:

This was the table straight from source:

Txtcher_0-1745693226924.png

After adding custom step - note the null value next to Response Required it should be 11/22/2023 12:00:00 AM, not null

Txtcher_1-1745693385742.png

After inserting a step to sort the data by ascending CreatedDate and Id, I get this after the Custom Step, which is the correct version of the history:

Txtcher_2-1745693549191.png

So thank you so much. All is well.

 

This is a very standard process with clear rules for identifying snapshot values

 

1. if there is no entry in the field history that means the current value is used

2. if there are entries in the field history before the snapshot date then use the New Value of the latest entry

3. if there are entries in the field history after the snapshot date then use the Old Value of the earliest entry

4. if the target field is blank then move on to the next extry

Oh wow. You may have just saved us hours of work. Thank you so very much for sharing.

1. The SFDC Field History data is perfect as it is. No need to pivot or unpivot

2. NOTE : !!! NOTE : Both SFDC and Power Query are case sensitive. Power BI IS NOT case sensitive. If you cannot get the CaseSafeID from SFDC you MUST create it yourself in Power Query.

 

= (ObjectID as text) as text => let AN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ", NFTC = (pos) => Number.From(Text.Contains(AN, Text.At(ObjectID,pos))), FirstIndex = NFTC(0) + NFTC(1)*2 + NFTC(2)*4 + NFTC(3)*8 + NFTC(4)*16, SecondIndex = NFTC(5) + NFTC(6)*2 + NFTC(7)*4 + NFTC(8)*8 + NFTC(9)*16, ThirdIndex = NFTC(10) + NFTC(11)*2 + NFTC(12)*4 + NFTC(13)*8 + NFTC(14)*16, ANP = AN & "012345", CaseSafeID = ObjectID & Text.At(ANP, FirstIndex) & Text.At(ANP, SecondIndex) & Text.At(ANP, ThirdIndex) in CaseSafeID

Thank  you for your response. Unfortunately, I can't make any sense of it. What is SFDC? And why is the history perfect as it? The code makes no sense to me either. What does it do?

 

We need to calculated the cumulative time an Application was in a given status.

 

My apologies but you completely lost me.

SFDC = Salesforce.com - that's where your data comes from.

Helpful resources

Announcements
PBIApril_Carousel

Power BI Monthly Update - April 2025

Check out the April 2025 Power BI update to learn about new features.

Notebook Gallery Carousel1

NEW! Community Notebooks Gallery

Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.

April2025 Carousel

Fabric Community Update - April 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors
Top Kudoed Authors
Users online (2,913)
  • Olashileayomi
  • danextian
  • data-marty
  • AllisonKennedy
  • persepta
  • GilbertQ
  • NROBBIANO
  • dcheng029
  • tsosa121
  • ashmitp869
  • maj07msu
  • clongwell
  • DeanTCC
  • Starlynk
  • RollingRock2315
  • EmmanouilTroula
  • zwodek
  • ug_nonsso_user
  • Jordmwhite
  • frithjof_v
  • gautampruthi
close