
Say you have an application that logs stringified JSON to CloudWatch Logs, and you need to run some analysis on that data. Here’s the JSON:
{ "clientId": "abc123", "message": "hello" }
Here’s how this JSON looks when it’s logged:
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
Suppose you want to get a count of messages received from each client. Let’s see how to go about building a query in CloudWatch Logs Insights that’ll give us this output:
|----------|----------|
| clientId | count(*) |
|----------|----------|
| abc123   | 4        |
| def456   | 3        |
|----------|----------|
First of all, since every CloudWatch log event carries several fields besides the log text itself (for example, @timestamp), we select just the log messages using:
fields @message
This gives us all the log statements:
2020-06-25 INFO calling server...
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
2020-06-25 INFO calling server...
2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }
Now, let’s filter out the unwanted log statements:
fields @message
| filter @message like 'response from server'
This leaves us with:
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }
Excellent! Next, we have to extract the client ID so that we can group by it and count the number of messages in each group. Use the parse command to extract the client ID:
fields @message
| filter @message like 'response from server'
| parse @message '"clientId": "*", "message"' as clientId
The parse statement above overlays the pattern we specified in single quotes on each log message. Wherever the pattern contains a wildcard (*), the text that matches at that position is extracted into the field named after “as”. That’s how the client ID from each log message ends up in the newly created field named clientId.
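To make the wildcard behaviour more concrete, here is a rough local analogue in Python. It is only a sketch of the idea, not how CloudWatch implements parse: the helper name glob_parse is hypothetical, and it assumes each * can be treated as a non-greedy capture between the literal parts of the pattern.

```python
import re


def glob_parse(pattern, message):
    """Return the text matched by each * in pattern, or None if there is no match.

    Hypothetical helper that mimics the idea behind a glob-style parse
    pattern. It is an illustration, not CloudWatch's implementation.
    """
    # Escape the literal pieces and join them with a non-greedy capture
    # group, so every * becomes "(.*?)".
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = "(.*?)".join(literal_parts)
    match = re.search(regex, message)
    return list(match.groups()) if match else None


line = '2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }'
print(glob_parse('"clientId": "*", "message"', line))  # ['abc123']
```

The literal text on either side of the wildcard is what anchors the match, which is why the pattern in the query includes the trailing ", "message" part rather than stopping at the wildcard.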
Now, to count the number of log statements for each client ID, use the stats command with the count function as shown below:
fields @message
| filter @message like 'response from server'
| parse @message '"clientId": "*", "message"' as clientId
| stats count(*) by clientId
And voila, just like that, we have our desired output:
|----------|----------|
| clientId | count(*) |
|----------|----------|
| abc123   | 4        |
| def456   | 3        |
|----------|----------|
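If it helps to see the whole pipeline outside of CloudWatch, the same filter, parse and count steps can be sketched in plain Python. This is only an analogy for reasoning about the query. The sample lines and the function name count_by_client are made up for illustration, so the counts differ from the table above.

```python
import re
from collections import Counter

# Hypothetical sample log lines, modelled on the ones in this post.
LOG_LINES = [
    '2020-06-25 INFO calling server...',
    '2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }',
    '2020-06-25 INFO calling server...',
    '2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }',
    '2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hey" }',
]

# Regex equivalent of the parse pattern '"clientId": "*", "message"'.
CLIENT_ID = re.compile(r'"clientId": "(.*?)", "message"')


def count_by_client(lines):
    """Count 'response from server' lines per client ID."""
    counts = Counter()
    for line in lines:
        # Rough analogue of: filter @message like 'response from server'
        if 'response from server' not in line:
            continue
        # Rough analogue of: parse @message '...' as clientId
        match = CLIENT_ID.search(line)
        if match:
            # Rough analogue of: stats count(*) by clientId
            counts[match.group(1)] += 1
    return dict(counts)


print(count_by_client(LOG_LINES))  # {'abc123': 2, 'def456': 1}
```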