Custom Log Types

Overview

Panther allows users to define their own log types by adding a Custom Log Type entry. Custom Log Types are identified by a Custom. prefix in their name and can be used wherever a 'native' Log Type is used:

  • You can use a Custom Log Type when onboarding data through SQS or S3.

  • You can write Rules for these Log Types.

  • You can query the data in Data Explorer. Panther will create a new table for the Custom Log Type once you onboard a source that uses it.

  • You can query the data through Indicator Search.

Adding a Custom Log Type

You can add a Custom Log Type by navigating to Analysis -> Schemas and clicking on the 'New Schema' button in the upper right corner.

Here you must enter a name for the Custom Log Type (e.g. Custom.SampleAPI) and write or paste your YAML Log Schema definition. Use the 'Validate Syntax' button at the bottom to verify that your schema contains no errors, then hit 'Save'.
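For example, a minimal Log Schema pasted into the form might look like the sketch below. The message field is purely illustrative; a fuller example appears later in this page.

version: 0
fields:
  - name: time
    description: Event timestamp
    required: true
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
  - name: message
    description: The log message
    type: string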

Note that the 'Validate Syntax' button only checks the syntax of the Log Schema. 'Save' might still fail due to name conflicts.

If all went well, you can now navigate to Log Analysis -> Sources and either add a new source or modify an existing one to use the new Custom.SampleAPI Log Type. Once Panther receives events from this Source, it will process the logs and store the Log Events in the custom_sampleapi table. You can now write Rules to match against these logs and query them using the Data Explorer.

Editing a Custom Log Type

Panther allows limited editing of Custom Log Types. Specifically:

  • You can modify the parser configuration to fix bugs or add new patterns.

  • You can add new fields to the schema.

  • You cannot rename existing fields.

  • You cannot delete existing fields (removing a field and adding it back under a different name would amount to renaming it in two steps).

  • You cannot change the type of an existing field (this includes the element type for array fields).

You can edit a Custom Log Type by clicking on the Edit action in the details page of a Custom Log Type. Modify the YAML and click Update to submit your change. Validate Syntax can check the YAML for structural compliance but the rules described above can only be checked on Update. The update will be rejected if the rules are not followed.
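For example, appending a new field to an existing schema is an accepted change, whereas renaming or retyping one is not. The request_id field below is purely illustrative:

fields:
  # ...existing fields remain exactly as they were...
  - name: request_id   # newly added field; additions are allowed
    description: Correlation id of the request
    type: string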

Deleting a Custom Log Type

A Custom Log Type can be deleted if no Source is using it.

A deleted Custom Log Type is removed from the listing and its tables are hidden from the Data Explorer view.

Deleting a Custom Log Type does not affect any data already stored in the data lake. All data remains queryable through Data Explorer or Indicator Search. However, this requires reserving the name of a Custom Log Type even after it has been deleted. Trying to add a Custom Log Type with the same name at a later time will fail due to the name conflict.

You can delete a Custom Log Type by clicking on the Delete action in the details page of a Custom Log Type. The action will succeed only if the Custom Log Type is not currently in use by any source.

Writing a Log Schema for JSON logs

You can make use of our pantherlog CLI tool to help you generate your Log Schema (see Pantherlog CLI below).

To parse log files where each line is JSON, you have to define a Log Schema that describes the structure of each log entry.

For example, suppose you have logs that adhere to the following JSON structure:

{
  "method": "GET",
  "path": "/-/metrics",
  "format": "html",
  "controller": "MetricsController",
  "action": "index",
  "status": 200,
  "params": [],
  "remote_ip": "1.1.1.1",
  "user_id": null,
  "username": null,
  "ua": null,
  "queue_duration_s": null,
  "correlation_id": "c01ce2c1-d9e3-4e69-bfa3-b27e50af0268",
  "cpu_s": 0.05,
  "db_duration_s": 0,
  "view_duration_s": 0.00039,
  "duration_s": 0.0459,
  "tag": "test",
  "time": "2019-11-14T13:12:46.156Z"
}

You can define a Log Schema for these logs using:

version: 0
fields:
  - name: time
    description: Event timestamp
    required: true
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
  - name: method
    description: The HTTP method used for the request
    type: string
  - name: path
    description: The path used for the request
    type: string
  - name: remote_ip
    description: The remote IP address the request was made from
    type: string
    indicators: [ ip ] # the value will be appended to `p_any_ip_addresses` if it's a valid ip address
  - name: duration_s
    description: The number of seconds the request took to complete
    type: float
  - name: format
    description: Response format
    type: string
  - name: user_id
    description: The id of the user that made the request
    type: string
  - name: params
    type: array
    element:
      type: object
      fields:
        - name: key
          description: The name of a Query parameter
          type: string
        - name: value
          description: The value of a Query parameter
          type: string
  - name: tag
    description: Tag for the request
    type: string
  - name: ua
    description: UserAgent header
    type: string

The YAML specification can either be edited directly in the Panther UI or prepared offline in your editor/IDE of choice. For more information on the structure and fields in a Log Schema, see the Log Schema Reference.

Writing a Log Schema for text logs

Panther handles logs that are not structured as JSON by using a 'parser' that translates each log line into key/value pairs and feeds it as JSON to the rest of the pipeline. You can define a text parser using the parser field of the Log Schema. Panther provides the following parsers for non-JSON formatted logs:

  • fastmatch: Match each line of text against one or more simple patterns.

  • regex: Use regular expression patterns to handle more complex matching such as conditional fields, case-insensitive matching, etc.

  • csv: Treat log files as CSV, mapping column names to field names.
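For illustration, a schema using the fastmatch parser could look roughly like the sketch below. The pattern and the field names are made up for this example, and the exact keys accepted by each parser are covered in their respective documentation:

version: 0
parser:
  fastmatch:
    match:
      # each %{name} placeholder captures a value into the field of the same name
      - '%{time} %{level} %{message}'
fields:
  - name: time
    description: Event timestamp
    required: true
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
  - name: level
    description: Log level
    type: string
  - name: message
    description: The log message
    type: string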

Pantherlog CLI

Panther provides a simple CLI tool to help you work with the Custom Logs feature. The tool is called pantherlog and an executable for each platform is provided with the release. The executables can be downloaded from the panther-community S3 bucket; see more details on the operations help page.

Generating a Schema from JSON samples

You can use the tool to generate a schema file out of sample files in new-line delimited JSON format. The tool will scan the provided logs and print the inferred schema to stdout.

For example, to infer the schema of logs sample_logs.jsonl and output to schema.yml, use:

$ ./pantherlog infer sample_logs.jsonl > schema.yml

WARNING: The tool has the following limitations:

  • It will identify a string as a timestamp only if the string is in RFC3339 format. Make sure to review the schema after it is generated by the tool and identify fields that should be of type timestamp instead.

  • It will not mark any timestamp field as isEventTime: true. Make sure to select the appropriate timestamp field and mark it as isEventTime: true. For more information regarding isEventTime: true see timestamp.

  • It is able to infer only 3 types of indicators: ip, aws_arn, url. Make sure to review the fields and add more indicators as appropriate.

Make sure to review the generated schema and edit it appropriately before deploying it to your production environment!
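For example, if pantherlog left the event timestamp without the isEventTime marker, you would adjust that field along the following lines before using the schema (a sketch based on the time field shown earlier):

# inferred by pantherlog:
# - name: time
#   type: timestamp
#   timeFormat: rfc3339
# after review, mark it as the event timestamp:
- name: time
  description: Event timestamp
  required: true
  type: timestamp
  timeFormat: rfc3339
  isEventTime: true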

Trying out a Schema

You can use the tool to validate a schema file and use it to parse log files. Note that the events in the log files need to be separated by new lines. Processed logs are written to stdout and errors to stderr.

For example, to parse logs in sample_logs.jsonl with the log schema in schema.yml, use:

$ ./pantherlog parse --path schema.yml --schemas Schema.Name sample_logs.jsonl

The tool can also accept input via stdin so it can be used in a pipeline:

$ cat sample_logs.jsonl | ./pantherlog parse --path schema.yml

Running tests for a Schema

You can use the tool to run unit tests. You can define unit tests for your Custom Schema in YAML files. The format for the unit tests is described in writing parsers. To run tests defined in a schema_tests.yml file for a custom schema defined in schema.yml use:

$ ./pantherlog test schema.yml schema_tests.yml

The first argument is a file or directory containing schema YAML files. The rest of the arguments are test files to run. If you don't specify any test file arguments and the first argument is a directory, the tool will look for tests in YAML files with a _tests.yml suffix.
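The exact layout of a test file is defined in writing parsers. As a rough sketch only, a test case pairing an input event with its expected output might look like the following; the key names here are assumptions, so check the writing parsers page for the authoritative format:

# hypothetical test case for the Custom.SampleAPI schema shown earlier
name: sample api event
logType: Custom.SampleAPI
input: |
  {"method": "GET", "path": "/-/metrics", "time": "2019-11-14T13:12:46.156Z"}
result: |
  {"method": "GET", "path": "/-/metrics", "time": "2019-11-14T13:12:46.156Z", "p_log_type": "Custom.SampleAPI"}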

Uploading Log Schemas with the Panther Analysis Tool

If you choose to maintain your log schemas outside of Panther, for example in order to keep them under version control and review changes before updating, you can upload the YAML files programmatically with the Panther Analysis Tool.

The uploader command receives a base path as an argument and then proceeds to recursively discover all files with extensions .yml and .yaml.

It is recommended to keep schema files separate from other unrelated files; otherwise the uploader may pick up YAML files that are not valid schemas and report several unrelated errors.

panther_analysis_tool update-custom-schemas --path ./schemas

The uploader will check whether a schema with the same name already exists and proceed with the update, or create a new one if no matching schema name is found.

The schema field must always be defined in the YAML file and be consistent with the existing schema name for an update to succeed. For an example, see here.
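As a rough sketch, a schema file prepared for the uploader could look like the following, with the schema field carrying the Custom Log Type name and the rest of the definition using the same format shown earlier in this page (the fields are illustrative):

schema: Custom.SampleAPI
version: 0
fields:
  - name: time
    description: Event timestamp
    required: true
    type: timestamp
    timeFormat: rfc3339
    isEventTime: true
  - name: method
    description: The HTTP method used for the request
    type: string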

The uploaded files are validated with the same criteria as Web UI updates.