Say you have an application that logs stringified JSON to CloudWatch Logs & you need to run some analysis on that data. Here’s the JSON:
{
"clientId": "abc123",
"message": "hello"
}
Here’s how this JSON looks when it’s logged:
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
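To make the setup concrete, here is a minimal sketch of how an application might produce a line like this. It assumes a Python app using the standard logging module, which the original scenario doesn't specify; the helper name format_response_log and the logger name "app" are made up for illustration.

```python
import json
import logging


def format_response_log(payload: dict) -> str:
    """Build the log message by stringifying the JSON payload."""
    # json.dumps uses ", " and ": " as separators by default, so the
    # '"clientId": "abc123", "message"' portion matches the sample line.
    return "response from server: " + json.dumps(payload)


if __name__ == "__main__":
    # Date-only timestamp and level, mirroring the sample log line.
    logging.basicConfig(
        format="%(asctime)s %(levelname)s %(message)s",
        datefmt="%Y-%m-%d",
        level=logging.INFO,
    )
    logging.getLogger("app").info(
        format_response_log({"clientId": "abc123", "message": "hello"})
    )
```

Note that json.dumps doesn't put spaces just inside the braces the way the sample line does. That difference doesn't matter for the query built in this post, since the pattern it relies on only covers the text between the two keys.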
Suppose you want to get a count of messages received from each client. Let’s see how to go about building a query in CloudWatch Logs Insights that’ll give us this output:
|----------|----------|
| clientId | count(*) |
|----------|----------|
| abc123 | 4 |
| def456 | 3 |
|----------|----------|
First of all, every CloudWatch log event is a record with several fields (such as @timestamp & @message), so we start by selecting just the log message using:
fields @message
This gives us all the log statements:
2020-06-25 INFO calling server...
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
2020-06-25 INFO calling server...
2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }
Now, let’s filter out the unwanted log statements:
fields @message |
filter @message like 'response from server'
This leaves us with:
2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }
2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }
Excellent! Next, we have to extract the client ID so we can group by it later on & count the number of messages in each group. Use the parse command to extract the client ID:
fields @message |
filter @message like 'response from server' |
parse @message '"clientId": "*", "message"' as clientId
What the above parse statement does is overlay the pattern we specified in single quotes over the log message. Wherever the pattern contains a wildcard (*), the text matched at that position is extracted into the field named after “as”. That’s how we extract the client ID from each log message into the newly created field named clientId.
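If the wildcard idea feels abstract, it can help to think of the pattern as a regular expression in disguise. The Python sketch below is only an approximation for building intuition, not how CloudWatch implements parse: the function names glob_to_regex and parse_glob are hypothetical, and treating each * as a non-greedy capture group is an assumption.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Turn a glob-style pattern into a regex with one group per '*'."""
    # Escape the literal pieces, then join them with a capture group
    # standing in for each wildcard.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("(.*?)".join(literal_parts))


def parse_glob(message: str, pattern: str, names: list) -> dict:
    """Extract the wildcard matches from message into named fields."""
    match = glob_to_regex(pattern).search(message)
    if match is None:
        return {}  # the pattern does not fit this message
    return dict(zip(names, match.groups()))


if __name__ == "__main__":
    line = ('2020-06-25 INFO response from server: '
            '{ "clientId": "abc123", "message": "hello" }')
    print(parse_glob(line, '"clientId": "*", "message"', ["clientId"]))
```

The literal text on both sides of the wildcard is what anchors the match, which is why the pattern includes the trailing , "message" rather than stopping at the *.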
Now, to count the number of log statements containing a particular client ID, use the stats command with the count function as shown below:
fields @message |
filter @message like 'response from server' |
parse @message '"clientId": "*", "message"' as clientId |
stats count(*) by clientId
And voila, just like that, we have our desired output:
|----------|----------|
| clientId | count(*) |
|----------|----------|
| abc123 | 4 |
| def456 | 3 |
|----------|----------|
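To sanity-check the logic outside of CloudWatch, the same three steps (filter, parse, stats) can be mimicked locally on a handful of log lines. This is a rough Python sketch, not a replacement for the query: count_by_client is a hypothetical helper, the regex is a hand-written stand-in for the parse pattern, and skipping lines where no client ID is found is an assumption of this sketch.

```python
import re
from collections import Counter

# Hand-written equivalent of the parse pattern '"clientId": "*", "message"'.
CLIENT_ID = re.compile(r'"clientId": "(.*?)", "message"')


def count_by_client(lines):
    """Count 'response from server' lines per client ID."""
    counts = Counter()
    for line in lines:
        # filter @message like 'response from server'
        if "response from server" not in line:
            continue
        # parse @message '...' as clientId
        match = CLIENT_ID.search(line)
        if match is None:
            continue  # assumption: ignore lines without a client ID
        # stats count(*) by clientId
        counts[match.group(1)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        '2020-06-25 INFO calling server...',
        '2020-06-25 INFO response from server: { "clientId": "abc123", "message": "hello" }',
        '2020-06-25 INFO calling server...',
        '2020-06-25 INFO response from server: { "clientId": "def456", "message": "hi" }',
    ]
    print(count_by_client(sample))
```

Running a few real log lines through something like this before writing the query is a cheap way to confirm that the pattern lines up with what the application actually logs.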