12 Apr 2024 04:42 PM
Hello!
I need to extract a large volume of data from Grail. It was not possible through a Notebook because I hit the limit of 100,000 records.
I need data from the last 30 days, but I can extract it in 24-hour chunks to avoid problems.
The difficulty I am having is that when I execute my query via "/query:execute", because it is large, I receive the status "RUNNING".
Then I call "/query:poll" to retrieve the results. But two problems arise here: either I get error 410, saying that the results have expired, or the browser crashes (even though my computer has a good configuration). What would you recommend?
My body:
{
    "query": "fetch logs\n| filter dt.system.bucket==\"bucketABC\"\n| filter matchesValue(k8s.container.name, \"containerABC\") and matchesPhrase(content, \"content\")\n| parse content, \"\"\"DATA 'for customer ' SPACE? LD:CPF.passo1'\"'\"\"\"\n| fields `timestamp.passo1` = timestamp, `status.passo1` = status, `content.passo1` = content, CPF.passo1\n| lookup [fetch logs\n | filter dt.system.bucket==\"bucketABC\"\n\t | filter ((matchesValue(k8s.container.name, \"containerABC\") and matchesPhrase(content, \"content\") and matchesPhrase(content, \"content\")))\n\t | parse content, \"\"\"DATA 'customerId [' SPACE? LD:CPF.passo2']'\"\"\"\n | fields `timestamp.passo2` = timestamp, `status.passo2` = status, `content.passo2` = content, CPF.passo2], lookupField:CPF.passo2, sourceField:CPF.passo1, prefix:\"-\"\n | lookup [fetch logs\n | filter dt.system.bucket==\"bucketABC\"\n | filter ((matchesValue(k8s.container.name, \"containerABC\") and matchesPhrase(content, \"content\")))\n | parse content, \"\"\"DATA 'customerId [' SPACE? LD:CPF.passo3']'\"\"\"\n | fields `timestamp.passo3` = timestamp, `status.passo3` = status, `content.passo3` = content, CPF.passo3], lookupField:CPF.passo3, sourceField:CPF.passo1, prefix:\"--\"\n | lookup [fetch logs\n | filter dt.system.bucket==\"bucketABC\"\n | filter ((matchesValue(k8s.container.name, \"containerABC\") and (matchesPhrase(content, \"content\"))))\n | parse content, \"\"\"DATA 'customer ' SPACE? LD:CPF.passo4'\"'\"\"\"\n | fields `timestamp.passo4` = timestamp, `status.passo4` = status, `content.passo4` = content, CPF.passo4], lookupField:CPF.passo4, sourceField:CPF.passo1, prefix:\"---\"",
    "defaultTimeframeStart": "2024-04-09T00:00:00.123Z",
    "defaultTimeframeEnd": "2024-04-09T23:59:59.123Z",
    "timezone": "GMT-3",
    "locale": "en_US",
    "maxResultRecords": 1000000000000,
    "maxResultBytes": 1000000,
    "fetchTimeoutSeconds": 600,
    "requestTimeoutMilliseconds": 10000,
    "enablePreview": true,
    "defaultScanLimitGbytes": 500
}
08 Oct 2024 07:47 AM
@wellpplava
You'll need a while loop until you get the SUCCEEDED state.
This is a Python example I am using in my app:
import requests
from time import sleep

def get_results(bearer_token, requestToken):
    try:
        if bearer_token:
            url = 'https://{environmentid}.apps.dynatrace.com/platform/storage/query/v1/query:poll'
            headers = {
                "accept": "application/json",
                "Content-Type": "application/json",
                "Authorization": f"Bearer {bearer_token}"
            }
            params = {
                'request-token': requestToken,
                'request-timeout-milliseconds': '60',
                'enrich': 'metric-metadata',
            }
            response = requests.get(url, params=params, headers=headers)
            # Keep polling until the query leaves the RUNNING state
            while response.json()['state'] == 'RUNNING':
                print(
                    f"Status: {response.json()['state']}\n"
                    f" Progress: {response.json()['progress']}\n"
                    f" TTL seconds: {response.json()['ttlSeconds']}\n"
                    f"Trying again in 2 sec...\n"
                )
                sleep(2)
                response = requests.get(url, params=params, headers=headers)
            if response.json()['state'] == 'SUCCEEDED':
                print(
                    f"Status: {response.status_code}\n"
                    f"State: {response.json()['state']}\n"
                    f"Returned records: {str(response.json()['result']['records'])[:50]}"
                )
                return response.json()['result']['records']
            else:
                # A failed or expired query has no 'result' key, so report
                # the error details instead of trying to read records
                print(
                    f"Something is not right!\n"
                    f"Status: {response.status_code}\n"
                    f"{response.json()['error']['details']['errorMessage']}\n"
                    f"{response.json()['error']['details']['errorType']}"
                )
                return None
        else:
            print("Failed to retrieve bearer token.")
            return None
    except Exception as e:
        print(f"Error: {str(e)}")
        return None
You'll need the bearer token to perform the query, and the request token that is returned when you start the query.
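If it helps, this is roughly how the request token can be obtained. A minimal sketch, assuming the query:execute endpoint and the same placeholder environment URL as above; start_query is just an illustrative name:

import requests

def start_query(bearer_token, data):
    # Submit the DQL query; a large query comes back with state RUNNING
    # plus a requestToken to pass to query:poll
    url = 'https://{environmentid}.apps.dynatrace.com/platform/storage/query/v1/query:execute'
    headers = {
        "accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {bearer_token}"
    }
    response = requests.post(url, json=data, headers=headers)
    body = response.json()
    if body.get('state') == 'SUCCEEDED':
        # Small queries can finish immediately, so there is nothing to poll
        return None, body['result']['records']
    return body.get('requestToken'), None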
On top of that, I suggest increasing the bytes that you are returning (at least for my use case, I need big chunks of data).
data = {
    "query": query,
    # "defaultTimeframeStart": start_date,
    # "defaultTimeframeEnd": end_date,
    "timezone": timezone,
    "locale": region,
    "maxResultRecords": 1000000,
    "maxResultBytes": 100000000,
    "fetchTimeoutSeconds": 6000,
    "requestTimeoutMilliseconds": 1000,
    "enablePreview": False,
    "defaultScanLimitGbytes": 10000
}
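And to come back to the 30-days part of the question: once execute and poll are wrapped in functions, the timeframe can be walked in 24-hour windows. A minimal sketch, assuming the hypothetical start_query helper above and the get_results function from earlier; fetch_last_30_days is just an illustrative name:

from datetime import datetime, timedelta, timezone

def fetch_last_30_days(bearer_token, query):
    all_records = []
    end = datetime.now(timezone.utc).replace(microsecond=0)
    window_start = end - timedelta(days=30)
    # Walk the 30-day range in 24-hour windows so each result set stays small
    while window_start < end:
        window_end = min(window_start + timedelta(days=1), end)
        data = {
            "query": query,
            "defaultTimeframeStart": window_start.isoformat().replace('+00:00', 'Z'),
            "defaultTimeframeEnd": window_end.isoformat().replace('+00:00', 'Z'),
            "maxResultRecords": 1000000,
            "maxResultBytes": 100000000,
            "fetchTimeoutSeconds": 6000
        }
        request_token, records = start_query(bearer_token, data)
        if request_token:
            # Poll right away: letting the result sit too long is what
            # triggers the 410 "results expired" response from query:poll
            records = get_results(bearer_token, request_token)
        if records:
            all_records.extend(records)
        window_start = window_end
    return all_records

Doing the polling in a script instead of a Notebook also sidesteps the browser crashes when rendering very large result sets.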