cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

How do you manage nextPageKey automatically when extracting lots of data from Dynatrace?

Hello Community people!

 

more and more often I'm asked about big enough data extraction from Dynatrace.
I usually work with Dynatrace api v2 and Power Query (M) from Microsoft.

When I hit the limit for the data extracted I try to segment the api query itself in order to not have the nextPageKey cursor but sometime it is inevitable.

So my question is: how do you usally manage to extract data when nextPageKey is showed in your response load?

Are you scripting your costum solution in order to continue the extraction automatically? (without everytime re-typing the uri with the nextPageKey cursor?)

 

What other tools do you usually use in order to extract the data with Dynatrace apiv2?


Thanks in advance to everybody who will join this conversation and will contribute,

Regards

14 REPLIES 14

ChadTurner
DynaMight Legend
DynaMight Legend

we just leverage the clearing of the API and provide the key for the next page and so on. Its simple enough but im sure you could automate it with a postman script. 

-Chad

AntonioSousa
DynaMight Guru
DynaMight Guru

Same as @ChadTurner , we have done it in Linux shell, python and even MSDOS .bat scripts 😂😂😂

Have had some problems, but it was in our coding...

Antonio Sousa

Julius_Loman
DynaMight Legend
DynaMight Legend

I mostly use Python or cURL to fetch the data. But I'm really interested in how the paging should be used in the Power Query feature of Excel. 

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner

Mike_L
Dynatrace Guru
Dynatrace Guru

I usually use the Python API client, that one handles the next page key automatically so I don't have to care about it 😀

https://github.com/dynatrace-oss/api-client-python

To get all hosts and even handle the too many concurrent request message automatically is as simple as:

dt = Dynatrace("environment_url", "api_token", too_many_requests_strategy=TOO_MANY_REQUESTS_WAIT )
for entity in dt.entities.list('type("HOST")', fields="properties.memoryTotal,properties.monitoringMode"):
    print(entity.entity_id, entity.display_name, entity.properties)
Mike

Thank you to everybody who answered.

I'm taking my time to go trought the process of learning, thinking, practicing. Especially thank you for pointing me to the DT git repo. direction.

ct_27
DynaMight Pro
DynaMight Pro

I have the same problem and have yet to find a simple solution as well.  Mike_L's link is honestly extremely helpful but I'd rather not rely on it since it doesn't support all API's and it's not logic i can apply elsewhere.  I've tried similar approaches using postman and insomnia REST Clients but they handle nextpages in a way different than DT approaches it.

 

I've coded solutions in Python and PHP but they're clunky and honestly my code is embarassing to get it working.

 

I'm posting in response because the "Next Page" thing is actually a cause sometimes for me to just not want to use the API.  I know I'll only get a partial result set and it's too time consuming to put all the JSON responses together. This is even an issue with PowerBI and Excel reporting because they too can't handle the API's NextPage thing.

 

Is DT considering any solutions for this?  Being that DT is all API First it would be helpful to the clients if the APIs can be a bit more client friendly.  (don't get me start on inconsistent FROM string defaults)

HigherEd

satit_dpm
Contributor

Hi,

 

I have created a PowerQuery to go through pagination and get the whole list. 
What you need to do is to create Function and call it recursively.

In my case, I need to query all problems from last quarter.
So I have created this function named getProblemAPIResult(), and from the code below you can see it is calling itself passing nextPageKey.

 

(api as text, headers as record, parameters as text, nextPageKey as text, currentList as list) => 
let
    apiResult = if nextPageKey = ""
    then Json.Document(Web.Contents(api & parameters, headers))
    else Json.Document(Web.Contents(api & "?nextPageKey=" & nextPageKey, headers)),
    newList = List.Combine({currentList, apiResult[problems]}),
    hasNext = try apiResult[nextPageKey],
    returnList = if hasNext[HasError] 
    then newList
    else getProblemAPIResult(api, headers, parameters, apiResult[nextPageKey], newList)
in
    returnList

 
Then the rest is easy, you call this function passing API, all necessary parameters and keep last 2 parameters empty.
Only the function calling matters, the rest are just table transformation.

let
    problems = getProblemAPIResult(APIURL & "/v2/problems", [Headers=[Accept="application/json; charset=utf-8", Authorization="Api-Token " & APIToken]], "?pageSize=500&from=" & fromTime & "&to=" & toTime, "", {}),
    #"Converted to Table" = Table.FromList(problems, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"problemId", "displayId", "title", "impactLevel", "severityLevel", "status", "affectedEntities", "impactedEntities", "rootCauseEntity", "managementZones", "entityTags", "problemFilters", "startTime", "endTime"}, {"Column1.problemId", "Column1.displayId", "Column1.title", "Column1.impactLevel", "Column1.severityLevel", "Column1.status", "Column1.affectedEntities", "Column1.impactedEntities", "Column1.rootCauseEntity", "Column1.managementZones", "Column1.entityTags", "Column1.problemFilters", "Column1.startTime", "Column1.endTime"}),
    #"Expanded Column1.managementZones" = Table.ExpandListColumn(#"Expanded Column1", "Column1.managementZones"),
    #"Expanded Column1.managementZones1" = Table.ExpandRecordColumn(#"Expanded Column1.managementZones", "Column1.managementZones", {"name"}, {"Column1.managementZones.name"}),
    #"Invoked Custom Function" = Table.AddColumn(#"Expanded Column1.managementZones1", "problemStart", each EpochToICTDateTime([Column1.startTime])),
    #"Invoked Custom Function1" = Table.AddColumn(#"Invoked Custom Function", "problemEnd", each EpochToICTDateTime([Column1.endTime])),
    #"Added Custom" = Table.AddColumn(#"Invoked Custom Function1", "Duration (min)", each ([Column1.endTime]-[Column1.startTime])/60000)
in
    #"Added Custom"

 
Hope it helps.

Regards,
Satit

Hi satit_dpm, when I created the function named getProblemAPIResult() and the rest api everything work, but when y try to save i got the error: Expression.Error: A cyclic reference was encountered during evaluation.

I think the problem is here, do you know how to fix it?

returnList = if hasNext[HasError]
then newList
else getProblemAPIResult(api, headers, parameters, apiResult[nextPageKey], newList)

Hi, 

Unfortunately, I am unable to reproduce your issue 🤔
This works fine in Excel using the same PowerQuery code I have posted.
Could you tried using the code below as-is?

(api as text, headers as record, parameters as text, nextPageKey as text, currentList as list) => 
let
    apiResult = if nextPageKey = ""
    then Json.Document(Web.Contents(api & parameters, headers))
    else Json.Document(Web.Contents(api & "?nextPageKey=" & nextPageKey, headers)),
    newList = List.Combine({currentList, apiResult[problems]}),
    hasNext = try apiResult[nextPageKey],
    returnList = if hasNext[HasError] 
    then newList
    else getProblemAPIResult(api, headers, parameters, apiResult[nextPageKey], newList)
in
    returnList



Regards,
Satit

Hi there!! I'm trying to create a dataflow in power platform. 

I solved the issue with an "@" where the function is called recursively. (reference).

But now, I have a new issue. In power platform, I could create the query but I can't save it. I got this error: "One or more tables references a dynamic data source."

In power platform I have this problem:

"This query refreshes with no problems in Power BI Desktop. However, when you publish a report that uses this code to PowerBI.com and try to refresh the dataset, you’ll see that refresh fails and returns a rather unhelpful error message:

Data source error Unable to refresh the model (id=1264553) because it references an unsupported data source.

The problem is that when a published dataset is refreshed, Power BI does some static analysis on the code to determine what the data sources for the dataset are and whether the supplied credentials are correct. Unfortunately in some cases, such as when the definition of a data source depends on the parameters from a custom M function, that static analysis fails and therefore the dataset does not refresh."

I change the code as Chris says using RelativePath and Query in web.content, but I couldn't fix it.

I tried the "skip connection" and use base_url with xxxx.live.dynatrace.com xxx.live.dynatrace.com/api , xxxx.live.dynatrace.com/api/v2 and none of then work.

So this is my code:

 

let
  getMetricsAPIResult = (base_url as text, next_url as text, qty as text, nextPageKey as text, currentList as list) => 
  let
    apiResult = if nextPageKey = ""
    then Json.Document(Web.Contents(base_url,
      [
        RelativePath = next_url,
        Query = 
          [
            pageSize = qty
          ],
          Headers=[Accept="application/json; charset=utf-8", Authorization="Api-Token XXXX"]
      ]
      ))
    else Json.Document(Web.Contents(base_url,
      [
        RelativePath = next_url,
        Query = 
          [
            nextPageKey = nextPageKey
          ],
          Headers=[Accept="application/json; charset=utf-8", Authorization="Api-Token XXXX"]
      ]
      )),
    newList = List.Combine({currentList, apiResult[metrics]}),
    hasNext_tmp = apiResult[nextPageKey], 
    hasNext = if hasNext_tmp is null 
    then try apiResult[nextPageKeyError]
    else try apiResult[nextPageKey],
    returnList = if hasNext[HasError]
    then newList
    else @getMetricsAPIResult(base_url, next_url, qty, apiResult[nextPageKey], newList)
  in
    returnList,

  consulta = getMetricsAPIResult("https://{environmentid}.live.dynatrace.com", "/api/v2/metrics", "500", "", {}),
  #"Converted to table" = Table.FromList(consulta, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to table", "Column1", {"metricId", "displayName", "description", "unit"}, {"metricId", "displayName", "description", "unit"}),
  #"Transform columns" = Table.TransformColumnTypes(#"Expanded Column1", {{"metricId", type text}, {"displayName", type text}, {"description", type text}, {"unit", type text}}),
  #"Replace errors" = Table.ReplaceErrorValues(#"Transform columns", {{"metricId", null}, {"displayName", null}, {"description", null}, {"unit", null}})
in
  #"Replace errors"

 



Hi,

Sorry, I am not able to support on this issue as Power platform is not really my forte 😅

Regards,
Satit

@satit_dpm wanted to say Thank you.  We were able to successfully implement the solution at our company and it's bringing a entirely new value into focus using PowerBI.  I think we can finally build the Problems dashboard we've been dreaming of all these year.

HigherEd

Glad to hear this helps you 😁
All the best for your dream project

amitbhatt
Observer

https://github.com/dynatrace-oss/api-client-python how we can pass to and from client to fetch the entities

 

Featured Posts