Custom Log Types

Overview

Panther allows users to define their own log types by adding a Custom Log Type entry. Custom Log Types are identified by a Custom. prefix in their name and can be used wherever a 'native' Log Type is used:

  • You can use a Custom Log Type when onboarding data through SQS or S3.

  • You can write Rules for these Log Types.

  • You can query the data in Data Explorer. Panther creates a new table for the Custom Log Type once you onboard a source that uses it.

  • You can query the data through Indicator Search.

Limitations

Panther currently only supports JSON logs as input to Custom Log Types. Support for other log formats (e.g. CSV, plain text) will be added in the future.

Currently, Panther does not allow editing of Custom Log Types. Changing the schema of a Log Type requires extensive checks and possibly a migration of log tables; editing will be allowed in the future once all corner cases are handled.

The only editing action allowed for a Custom Log Type is 'Delete', and it only succeeds if no Source is using the Log Type. Deleting a Custom Log Type does not affect any data already stored in the data lake; all data remains queryable through Data Explorer and Indicator Search. To support this, Panther reserves the name of a Custom Log Type even after it has been deleted, so adding another one with the same name will fail with a conflict.

Writing a Log Schema

Custom Log Types include a Log Schema describing the fields of a Log Event in YAML.

For example, suppose you have logs that adhere to the following JSON structure:

{
  "method": "GET",
  "path": "/-/metrics",
  "format": "html",
  "controller": "MetricsController",
  "action": "index",
  "status": 200,
  "params": [],
  "remote_ip": "1.1.1.1",
  "user_id": null,
  "username": null,
  "ua": null,
  "queue_duration_s": null,
  "correlation_id": "c01ce2c1-d9e3-4e69-bfa3-b27e50af0268",
  "cpu_s": 0.05,
  "db_duration_s": 0,
  "view_duration_s": 0.00039,
  "duration_s": 0.0459,
  "tag": "test",
  "time": "2019-11-14T13:12:46.156Z"
}

You can define a Log Schema for these logs using:

version: 0
fields:
  - name: time
    description: Event timestamp
    required: true
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
  - name: method
    description: The HTTP method used for the request
    type: string
  - name: path
    description: The path used for the request
    type: string
  - name: remote_ip
    description: The remote IP address the request was made from
    type: string
    indicators: [ ip ] # the value will be appended to `p_any_ip_addresses` if it's a valid ip address
  - name: duration_s
    description: The number of seconds the request took to complete
    type: float
  - name: format
    description: Response format
    type: string
  - name: user_id
    description: The id of the user that made the request
    type: string
  - name: params
    type: array
    element:
      type: object
      fields:
        - name: key
          description: The name of a Query parameter
          type: string
        - name: value
          description: The value of a Query parameter
          type: string
  - name: tag
    description: Tag for the request
    type: string
  - name: ua
    description: UserAgent header
    type: string

Such a YAML file can either be entered directly in the Panther UI or prepared offline in your editor/IDE of choice. For more information on the structure and fields of a Log Schema, see the Log Schema Reference below.

Testing a Log Schema

Panther provides a simple CLI tool to validate and test a log schema YAML file. The tool is called customlogs and an executable for each platform is provided with each release. The tool validates a schema file and uses it to parse log files in newline-delimited JSON format. Processed logs are written to stdout and errors to stderr.

For example, to parse logs in sample_logs.jsonl with the log schema in schema.yml, use:

$ ./customlogs -s schema.yml sample_logs.jsonl

The tool can also accept input via stdin so it can be used in a pipeline:

$ cat sample_logs.jsonl | ./customlogs -s schema.yml

Adding a Custom Log Type

You can add a Custom Log Type by navigating to Log Analysis -> Custom Logs and clicking on the 'New Schema' button in the upper right corner.

Create custom logs screen

Here you must enter a name for the Custom Log Type (e.g. Custom.SampleAPI) and write or paste your YAML Log Schema definition. Use the 'Validate Syntax' button at the bottom to verify that your schema contains no errors, and hit 'Save'.

Note that the 'Validate Syntax' button only checks the syntax of the Log Schema. 'Save' might still fail due to name conflicts.

If all went well, you can now navigate to Log Analysis -> Sources and either add a new source or modify an existing one to use the new Custom.SampleAPI Log Type. Once Panther receives events from this Source, it will process the logs and store the Log Events in the custom_sampleapi table. You can now write Rules to match against these logs and query them using the Data Explorer.

Log Schema Reference

LogSchema fields

  • version (0,required): The version of the log schema. This field should be set to zero (0). Its purpose is to allow backwards compatibility with future versions of the log schema.

  • fields ([]FieldSchema, required): The fields in each Log Event.

Example

version: 0
fields:
  - name: action
    type: string
    required: true
  - name: time
    type: timestamp
    timeFormat: unix

FieldSchema

A FieldSchema defines a field and its value. The field is defined by:

  • name (String, required): The name of the field.

  • required (Boolean): Whether the field is required or not.

  • description (String): Text documenting the field.

And its value is defined using the fields of a ValueSchema.
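
Putting these together, a single field entry might look like the following sketch, where name, required and description come from the FieldSchema and type and indicators come from the ValueSchema (the field itself is hypothetical):

- name: remote_ip                      # FieldSchema: the name of the field
  required: false                      # FieldSchema: the field may be absent from some events
  description: The client IP address   # FieldSchema: text documenting the field
  type: string                         # ValueSchema: the type of the value
  indicators: [ ip ]                   # ValueSchema: an option specific to string values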

ValueSchema

A ValueSchema defines a value and how it should be processed. Each ValueSchema has a type field that can take one of the following values:

  • object: A JSON object.

  • array: A JSON array where each element is of the same type.

  • timestamp: A timestamp value.

  • string: A string value.

  • int: A 32-bit integer number in the range -2147483648 to 2147483647.

  • smallint: A 16-bit integer number in the range -32768 to 32767.

  • bigint: A 64-bit integer number in the range -9223372036854775808 to 9223372036854775807.

  • float: A 64-bit floating point number.

  • boolean: A boolean value (true / false).

  • json: A raw JSON value.
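
For instance, a schema fragment using a few of the scalar value types could look like the following sketch (the field names here are hypothetical):

- name: status      # a value such as an HTTP status code fits in an int
  type: int
- name: cpu_s
  type: float
- name: is_admin
  type: boolean
- name: extra       # payloads of unknown or varying structure can be kept as raw JSON
  type: json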

The fields of a ValueSchema depend on the value of the type field.

  • object: fields ([]FieldSchema, required): An array of FieldSchema objects describing the fields of the object.

  • array: element (ValueSchema, required): A ValueSchema describing the elements of the array.

  • timestamp: timeFormat (String, required): The format to use for parsing the timestamp (see Timestamps).

  • timestamp: isEventTime (Boolean): A flag to tell Panther to use this timestamp as the Log Event Timestamp.

  • string: indicators ([]String): Tells Panther to extract indicators from this value (see Indicators).
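
As a sketch, these type-specific fields can be combined and nested; the following fragment (with hypothetical field names) shows an object containing a raw JSON field and an array of strings:

- name: request
  type: object
  fields:                # `fields` is required for object values
    - name: headers      # headers with no fixed structure can be kept as raw JSON
      type: json
    - name: tags
      type: array
      element:           # `element` is required for array values
        type: string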

Timestamps

Timestamps are defined by setting the type field to timestamp and specifying the timestamp format using the timeFormat field. The timestamp format can either be one of the built-in formats:

  • rfc3339: The most common timestamp format.

  • unix: Timestamp expressed in seconds since the UNIX epoch. It can handle fractions of seconds as a decimal part.

  • unix_ms: Timestamp expressed in milliseconds since the UNIX epoch.

  • unix_us: Timestamp expressed in microseconds since the UNIX epoch.

  • unix_ns: Timestamp expressed in nanoseconds since the UNIX epoch.

or you can define a custom format by using strftime notation. For example:

# The field is a timestamp using a custom timestamp format like "2020-09-14 14:29:21"
- name: ts
  type: timestamp
  timeFormat: "%Y-%m-%d %H:%M:%S" # note the quotes required for proper YAML syntax

Timestamp values can also be marked with isEventTime: true to tell Panther that it should use this timestamp as the p_event_time field. It is possible to set isEventTime on multiple fields; this covers cases where some logs have optional or mutually exclusive fields holding event time information. Since there can only be a single p_event_time for each Log Event, the priority is determined by the order of the fields in the schema.
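
For example, in the following sketch both fields are marked as event time; if an event contains both, created_at wins because it appears first in the schema (the field names are hypothetical):

- name: created_at
  type: timestamp
  timeFormat: rfc3339
  isEventTime: true
- name: updated_at       # used for p_event_time only if created_at is missing
  type: timestamp
  timeFormat: rfc3339
  isEventTime: true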

Indicators

Values of string type can be used as indicators. To mark a field as an indicator, set its indicators field to an array of indicator scanner names. This instructs Panther to parse the string and store any indicator values it finds in the relevant p_any_ fields. For example:

# Will scan the value as an IP address and store it in `p_any_ip_addresses`
- name: remote_ip
  type: string
  indicators: [ ip ]
# Will scan the value as a URL and store the domain name or IP address in `p_any_domain_names` or `p_any_ip_addresses`
- name: target_url
  type: string
  indicators: [ url ]

Using JSON Schema in an IDE

If your editor/IDE supports JSON Schema, you can use this JSON Schema file for validation and autocompletion.

JetBrains

You can find instructions on how to configure JetBrains IDEs to use custom JSON Schemas here.

VSCode

You can find instructions on how to configure VSCode to use JSON Schema here.