The open-source security lake platform for AWS


What is Matano?

Matano is an open source security lake platform for AWS. It lets you ingest petabytes of security and log data from various sources, store and query it in a data lake, and write Python detections as code for realtime alerting. Matano is fully serverless, designed specifically for AWS, and focused on high scale, low cost, and zero ops. It deploys entirely into your AWS account.


Features

Collect data from all your sources

Matano lets you collect log data from your sources using S3- or Kafka-based ingestion.

Ingest, transform, normalize log data

Matano normalizes and transforms your data using VRL (Vector Remap Language). Matano works with the Elastic Common Schema (ECS) by default, and you can define your own schema.
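For illustration, the kind of field remapping a VRL program performs can be sketched in Python; the snippet below is a simplified stand-in (field names follow ECS conventions, but this is not Matano's actual transformer):

```python
# Simplified stand-in for a VRL normalization step: map a raw vendor
# event onto ECS-style field names. Illustrative only, not Matano's
# actual transformer.
def normalize(raw: dict) -> dict:
    event = {}
    if "srcip" in raw:
        event["source"] = {"ip": raw["srcip"]}
    if "dstip" in raw:
        event["destination"] = {"ip": raw["dstip"]}
    if "ts" in raw:
        event["@timestamp"] = raw["ts"]
    # Preserve unmapped fields under a custom namespace.
    leftover = {k: v for k, v in raw.items() if k not in {"srcip", "dstip", "ts"}}
    if leftover:
        event["custom"] = leftover
    return event

print(normalize({"srcip": "10.0.0.1", "ts": "2023-01-01T00:00:00Z", "user": "alice"}))
```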

Store data in S3 object storage

Log data is always stored in S3 object storage for cost-effective, durable, long-term retention.

Apache Iceberg data lake

All data is ingested into an Apache Iceberg based data lake, allowing you to perform ACID transactions, time travel, and more on all your log data.

Serverless

Matano is a fully serverless platform, designed for zero-ops and unlimited elastic horizontal scaling.

Detections as code

Write Python detections to implement realtime alerting on your log data.
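A minimal sketch of what a detection might look like, assuming a `detect(record)` entry point and ECS-style field names (check the Matano docs for the exact interface):

```python
# Illustrative detection-as-code: flag failed console logins coming
# from outside a trusted network range. Field names and the
# detect(record) signature are assumptions for illustration.
TRUSTED_PREFIX = "10."

def detect(record: dict) -> bool:
    event = record.get("event", {})
    source = record.get("source", {})
    return (
        event.get("action") == "console_login"
        and event.get("outcome") == "failure"
        and not source.get("ip", "").startswith(TRUSTED_PREFIX)
    )
```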

Installing

You can install the matano CLI to deploy Matano into your AWS account and manage your Matano deployment.

Requirements

  • node>=12 and npm
  • Docker

From source

You can manually install from source.

git clone https://github.com/matanolabs/matano.git
make install

Comments
  • Transform error with client.geo.location

    Transform error with client.geo.location

    I'm creating a log source for Okta logs and am struggling to transform log data to the ECS fields client.geo.location.lat and client.geo.location.lon. With the VRL below, I consistently get the error "USER_ERROR: Failed at FindUnionVariant, likely schema issue." in the transformer Lambda. I have pretty much every other Okta log field working.

    Looking at the ECS schema JSON, both lat and lon are defined as floats, so this should work.

    Relevant VRL transform:

    .client.geo.location.lat = to_float(del(.json.client.geographicalContext.geolocation.lat)) ?? null
    .client.geo.location.lon = to_float(del(.json.client.geographicalContext.geolocation.lon)) ?? null

    Relevant log data:

    {
        "json": {
            "client": {
                "geographicalContext": {
                    "city": "Ashburn",
                    "country": "United States",
                    "geolocation": {
                        "lat": 39.0469,
                        "lon": -77.4903
                    },
                    "postalCode": "20149",
                    "state": "Virginia"
                }
            }
        }
    }

    Any assistance identifying the issue or bug would be appreciated.

    Thanks.

    opened by gdrapp 19
  • SQS Ingestion CDK Modifications

    SQS Ingestion CDK Modifications

    The reason for the naming difference on line 155 of DPMainStack, versus the normal pattern on line 156, is worth noting, as it is vital to passing metadata to the Transformer Lambda. Here's why:

    Consider the following queue names that were generated in testing:

    • MatanoDPMainStack-zeekhttpMatanoSQSSourceZeekHttpIngestQueue
    • MatanoDPMainStack-zeekdnsMatanoSQSSourceZeekDnsIngestionQueue

    Notice that right after the "-", the log_source.name and a table from that log_source are joined together as lowercase characters. In the Lambda code, we extract the substring between the "-" and the "M" in "Matano" to get the necessary metadata from an SQS message (see 1 below).

    The current limitations / workarounds:

    1. Not able to match against any other metadata in the SQS message that could tell the Transformer Lambda to discern S3 from SQS, other than event_source_arn. This is because we cannot rely on current log producers (vector.dev, fluentd, file system forwarders, Datadog, etc.) to support SQS message_attributes.
    2. Must place the metadata as early as possible in the queue name, both for efficiency and because the maximum number of characters in a queue name is 80. Since MatanoDPMainStack- is 18 characters, the log source plus table name for SQS is limited to 62 characters.
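The substring extraction described above can be sketched as follows (the helper name is hypothetical; the queue names are from the examples above):

```python
def extract_source_metadata(queue_name: str) -> str:
    """Return the lowercase log_source + table metadata embedded in a
    queue name: the span between the first '-' and the next 'Matano'
    marker. Helper name is hypothetical."""
    after_dash = queue_name.split("-", 1)[1]
    return after_dash[: after_dash.index("Matano")]

print(extract_source_metadata("MatanoDPMainStack-zeekhttpMatanoSQSSourceZeekHttpIngestQueue"))  # zeekhttp
```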
    opened by kai-ten 7
  • [Bug] Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [x]

    [Bug] Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [x]

    I have been trying to run Matano in a fresh personal AWS account, after having tried it in another account with extended Lambda limits, to see if any additional configuration or quota increase is required. I hit this error with Matano.

    Details below.

    Version: matano/0.0.0 linux-x64 node-v14.18.1 (note: this is the nightly build as of today).

    Error snippet from the terminal below:

    rams3sh@monastery:~/Garage/matano$ matano init
    ━━━ Matano: Get started Wizard ━━━
    
    Welcome to the Matano init wizard. This will get you started with Matano.
    Follow the prompts to get started. You can always change these values later.
    
    ✔ Which AWS Region to deploy to? · us-east-1
    ✔ What is the AWS Account ID to deploy to? · XXXXXXXXXXXXX
    ✔ Do you have an existing matano directory? (y/N) · false
      I will generate a Matano directory in the current directory.
    ✔ What is the name of the directory to generate?(use . for current directory) · .
    ✔ Generated Matano directory at /home/rams3sh/Garage/matano.
    ✔ Successfully initialized your account.
    ⠦ Now deploying Matano to your AWS account... 
    ›   Error: An error occurred: Command failed with exit code 1: /usr/local/matano-cli/cdk deploy DPMainStack --require-approval never --app /usr/local/matano-cli/matano-cdk 
    ...
    
     ›   Failed resources:
     ›   MatanoDPMainStack | 7:51:57 PM | CREATE_FAILED        | AWS::Lambda::Function            | DPMainStack/LakeWriter/AlertsFunction (LakeWriterAlertsFunctionCB567D9B) 
     ›   Resource handler returned message: "Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [50].
     ›    (Service: Lambda, Status Code: 400, Request ID: c990af9b-a3e6-4328-a3c7-4f0b01967c4f)" (RequestToken: b6eb4fad-441b-2493-86c5-5c29b6969a6f, HandlerErrorCode: 
     ›   InvalidRequest)
     ›   
     ›    ❌  DPMainStack (MatanoDPMainStack) failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS console: 
     ›   ROLLBACK_COMPLETE: Resource handler returned message: "Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its 
     ›   minimum value of [50]. (Service: Lambda, Status Code: 400, Request ID: c990af9b-a3e6-4328-a3c7-4f0b01967c4f)" (RequestToken: b6eb4fad-441b-2493-86c5-5c29b6969a6f, 
     ›   HandlerErrorCode: InvalidRequest)
     ›       at FullCloudFormationDeployment.monitorDeployment (/snapshot/node_modules/aws-cdk/lib/api/deploy-stack.ts:505:13)
     ›       at runMicrotasks (<anonymous>)
     ›       at processTicksAndRejections (internal/process/task_queues.js:95:5)
     ›       at deployStack2 (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:265:24)
     ›       at /snapshot/node_modules/aws-cdk/lib/deploy.ts:39:11
     ›       at run (/snapshot/node_modules/p-queue/dist/index.js:163:29)
     ›   
     ›    ❌ Deployment failed: Error: Stack Deployments Failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS console:
     ›    ROLLBACK_COMPLETE: Resource handler returned message: "Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its 
     ›   minimum value of [50]. (Service: Lambda, Status Code: 400, Request ID: c990af9b-a3e6-4328-a3c7-4f0b01967c4f)" (RequestToken: b6eb4fad-441b-2493-86c5-5c29b6969a6f, 
     ›   HandlerErrorCode: InvalidRequest)
     ›       at deployStacks (/snapshot/node_modules/aws-cdk/lib/deploy.ts:61:11)
     ›       at runMicrotasks (<anonymous>)
     ›       at processTicksAndRejections (internal/process/task_queues.js:95:5)
     ›       at CdkToolkit.deploy (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:339:7)
     ›       at initCommandLine (/snapshot/node_modules/aws-cdk/lib/cli.ts:374:12)
     ›
     ›   Stack Deployments Failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Resource 
     ›   handler returned message: "Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [50]. 
     ›   (Service: Lambda, Status Code: 400, Request ID: c990af9b-a3e6-4328-a3c7-4f0b01967c4f)" (RequestToken: b6eb4fad-441b-2493-86c5-5c29b6969a6f, HandlerErrorCode: 
     ›   InvalidRequest)
     ›   Created temporary directory for configuration files: /tmp/mtnconfigv9yADs/config
     ›   arn:aws:cloudformation:us-east-1:XXXXXXXXXX:stack/MatanoDPCommonStack/cebd94d0-7e14-11ed-9855-0e5a30013c2f
    
    
    

    Lambda quotas:

    rams3sh@monastery:~/Garage/matano$ aws lambda get-account-settings
    {
        "AccountLimit": {
            "TotalCodeSize": 80530636800,
            "CodeSizeUnzipped": 262144000,
            "CodeSizeZipped": 52428800,
            "ConcurrentExecutions": 50,
            "UnreservedConcurrentExecutions": 50
        },
        "AccountUsage": {
            "TotalCodeSize": 1337,
            "FunctionCount": 1
        }
    }
    

    Please let me know how to proceed from here.

    Also, do I have to increase the Lambda quota, given that it has separate pricing? Can there be an option not to enable this reserved concurrency as part of the Matano deployment? This would be helpful for experimentation use cases like my current scenario, where I don't expect production-scale events.

    More generally, such cases could be handled by a CLI argument giving the user the option to explicitly disable recommended production settings that may not be required for staging or experimentation.

    opened by rams3sh 5
  • Error parsing compressed file containing Cloudwatch event

    Error parsing compressed file containing Cloudwatch event

    Hello,

    I ran into this issue while testing Matano on some sample log files. The Transformer Lambda fails with the message: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: stream did not contain valid UTF-8', transformer/src/main.rs:538:58

    The file that I want to parse is delivered by Kinesis Firehose and contains CloudTrail logs streamed from CloudWatch to S3. It doesn't have an extension and its content type is marked as 'application/octet-stream'. Inside there is a JSON document representing a CloudWatch event. An important note on that type of file can be found here: https://docs.aws.amazon.com/firehose/latest/dev/writing-with-cloudwatch-logs.html. "CloudWatch log events are compressed with gzip level 6. If you want to specify OpenSearch Service or Splunk as the destination for the delivery stream, use a Lambda function to uncompress the records to UTF-8 and single-line JSON." I suspect that some additional parsing is required for this type of file.
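The extra step the reporter suspects is missing can be sketched in Python: gunzip the Firehose-delivered object before JSON parsing (the payload below is a minimal stand-in for a CloudWatch Logs subscription event):

```python
import gzip
import json

def parse_firehose_cloudwatch_record(raw: bytes) -> dict:
    """Decompress a gzip-compressed CloudWatch Logs payload (as
    delivered by Kinesis Firehose) and parse the embedded JSON."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))

# Round-trip demo with a minimal stand-in payload.
payload = {"messageType": "DATA_MESSAGE", "logEvents": [{"message": "hello"}]}
blob = gzip.compress(json.dumps(payload).encode("utf-8"))
print(parse_firehose_cloudwatch_record(blob)["logEvents"][0]["message"])  # hello
```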

    opened by marcin-kwasnicki 4
  • Config generates SQS ingestion if enabled

    Config generates SQS ingestion if enabled

    A few thoughts so far:

    • The current impl does not check for both log sources being enabled at once
      • In a way it's a feature, because it allows multiple ingestion types for a log source (not sure why someone would do that, but it'd be possible)
      • The downside is duplicate data if the same logs get sent to multiple ingestion types
    • In infra/lib/DPMainStack.ts, I think iterating over logSources outside of the S3/SQS constructs (lines 136 & 141), so that both share the same loop, would be better
      • This also sets the stage for adding more ingestion types (Kafka, DDB/Kinesis streams, SNS)
    • Have not completed an E2E test for SQS ingestion yet (or any necessary impl)
    • I added "enabled" to s3_source, but this would be a breaking change, so for now S3 is the default if nothing is specified

    Also, I was caught up trying to see if the managed Matano S3 bucket is working. I don't see /data/<log_source> getting generated when not defining a custom bucket. Is that expected?

    opened by kai-ten 3
  • Init Fails with: Resource handler returned message:

    Init Fails with: Resource handler returned message: "Invalid request provided: Queue visibility timeout: 30 seconds is less than Function timeout: 60 seconds"

    CLI Version (installed via docs today)

    mfranz@pixel-slate-cros:~/matano$  matano --version
    matano/0.0.0 linux-x64 node-v14.18.1
    mfranz@pixel-slate-cros:~/matano$ md5sum /usr/local/bin/matano
    ca5dbebd474f92dd3448bc54398b93b2  /usr/local/bin/matano
    

    Logs

    
     ›   MatanoDPMainStack |  99/104 | 10:27:12 AM | CREATE_COMPLETE      | AWS::Lambda::Function            | DPMainStack/Transformer/Function 
     ›   (TransformerFunctionFE009084) 
     ›   MatanoDPMainStack | 100/104 | 10:27:13 AM | CREATE_COMPLETE      | AWS::Lambda::Function            | DPMainStack/LakeWriter/Function (LakeWriterFunctionF773435F)
     › 
     ›   MatanoDPMainStack | 100/104 | 10:27:15 AM | CREATE_IN_PROGRESS   | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/LakeWriter/AlertsFunction/SqsEventSource:DPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805 
     ›   (LakeWriterAlertsFunctionSqsEventSourceDPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805B599AD6D) 
     ›   MatanoDPMainStack | 100/104 | 10:27:16 AM | CREATE_IN_PROGRESS   | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/Transformer/Function/SqsEventSource:DPMainStackDataBatcherOutputQueueD9616F88 
     ›   (TransformerFunctionSqsEventSourceDPMainStackDataBatcherOutputQueueD9616F888667E4CB) 
     ›   MatanoDPMainStack | 100/104 | 10:27:18 AM | CREATE_IN_PROGRESS   | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/LakeWriter/AlertsFunction/SqsEventSource:DPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805 
     ›   (LakeWriterAlertsFunctionSqsEventSourceDPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805B599AD6D) Resource creation Initiated
     ›   MatanoDPMainStack | 100/104 | 10:27:18 AM | CREATE_IN_PROGRESS   | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/LakeWriter/Function/SqsEventSource:DPMainStackMatanoLogstestlogsourceDefaultTableLakeWriterQueueC1E4E04B 
     ›   (LakeWriterFunctionSqsEventSourceDPMainStackMatanoLogstestlogsourceDefaultTableLakeWriterQueueC1E4E04BF71C3721) 
     ›   MatanoDPMainStack | 100/104 | 10:27:18 AM | CREATE_FAILED        | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/Transformer/Function/SqsEventSource:DPMainStackDataBatcherOutputQueueD9616F88 
     ›   (TransformerFunctionSqsEventSourceDPMainStackDataBatcherOutputQueueD9616F888667E4CB) Resource handler returned message: "Invalid request provided: Queue 
     ›   visibility timeout: 30 seconds is less than Function timeout: 60 seconds (Service: Lambda, Status Code: 400, Request ID: bc350930-2f6e-4f4c-9b68-809dd098f9c7)" 
     ›   (RequestToken: fb617621-26d6-2dd5-6739-1c4b5762885b, HandlerErrorCode: InvalidRequest)
     ›   MatanoDPMainStack | 100/104 | 10:27:19 AM | CREATE_FAILED        | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/LakeWriter/AlertsFunction/SqsEventSource:DPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805 
     ›   (LakeWriterAlertsFunctionSqsEventSourceDPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805B599AD6D) Resource creation cancelled
     ›   MatanoDPMainStack | 100/104 | 10:27:20 AM | CREATE_FAILED        | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/LakeWriter/Function/SqsEventSource:DPMainStackMatanoLogstestlogsourceDefaultTableLakeWriterQueueC1E4E04B 
     ›   (LakeWriterFunctionSqsEventSourceDPMainStackMatanoLogstestlogsourceDefaultTableLakeWriterQueueC1E4E04BF71C3721) Resource creation cancelled
     ›   MatanoDPMainStack | 100/104 | 10:27:22 AM | ROLLBACK_IN_PROGRESS | AWS::CloudFormation::Stack       | MatanoDPMainStack The following resource(s) failed to 
     ›   create: [TransformerFunctionSqsEventSourceDPMainStackDataBatcherOutputQueueD9616F888667E4CB, 
     ›   LakeWriterFunctionSqsEventSourceDPMainStackMatanoLogstestlogsourceDefaultTableLakeWriterQueueC1E4E04BF71C3721, 
     ›   LakeWriterAlertsFunctionSqsEventSourceDPMainStackAlertsDefaultTableLakeWriterQueueC3CE4805B599AD6D]. Rollback requested by user.
    

    Error Message

     ›   Failed resources:
     ›   MatanoDPMainStack | 10:27:18 AM | CREATE_FAILED        | AWS::Lambda::EventSourceMapping  | 
     ›   DPMainStack/Transformer/Function/SqsEventSource:DPMainStackDataBatcherOutputQueueD9616F88 
     ›   (TransformerFunctionSqsEventSourceDPMainStackDataBatcherOutputQueueD9616F888667E4CB) Resource handler returned message: "Invalid request provided: Queue 
     ›   visibility timeout: 30 seconds is less than Function timeout: 60 seconds (Service: Lambda, Status Code: 400, Request ID: bc350930-2f6e-4f4c-9b68-809dd098f9c7)" 
     ›   (RequestToken: fb617621-26d6-2dd5-6739-1c4b5762885b, HandlerErrorCode: InvalidRequest)
     ›   
     ›    ❌  DPMainStack (MatanoDPMainStack) failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS console: 
     ›   ROLLBACK_COMPLETE: Resource handler returned message: "Invalid request provided: Queue visibility timeout: 30 seconds is less than Function timeout: 60 seconds 
     ›   (Service: Lambda, Status Code: 400, Request ID: bc350930-2f6e-4f4c-9b68-809dd098f9c7)" (RequestToken: fb617621-26d6-2dd5-6739-1c4b5762885b, HandlerErrorCode: 
     ›   InvalidRequest)
     ›       at FullCloudFormationDeployment.monitorDeployment (/snapshot/node_modules/aws-cdk/lib/api/deploy-stack.ts:505:13)
     ›       at processTicksAndRejections (internal/process/task_queues.js:95:5)
     ›       at deployStack2 (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:265:24)
     ›       at /snapshot/node_modules/aws-cdk/lib/deploy.ts:39:11
     ›       at run (/snapshot/node_modules/p-queue/dist/index.js:163:29)
     ›   
     ›    ❌ Deployment failed: Error: Stack Deployments Failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS 
     ›   console: ROLLBACK_COMPLETE: Resource handler returned message: "Invalid request provided: Queue visibility timeout: 30 seconds is less than Function timeout: 60 
     ›   seconds (Service: Lambda, Status Code: 400, Request ID: bc350930-2f6e-4f4c-9b68-809dd098f9c7)" (RequestToken: fb617621-26d6-2dd5-6739-1c4b5762885b, 
     ›   HandlerErrorCode: InvalidRequest)
     ›       at deployStacks (/snapshot/node_modules/aws-cdk/lib/deploy.ts:61:11)
     ›       at processTicksAndRejections (internal/process/task_queues.js:95:5)
     ›       at CdkToolkit.deploy (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:339:7)
     ›       at initCommandLine (/snapshot/node_modules/aws-cdk/lib/cli.ts:374:12)
     ›
     ›   Stack Deployments Failed: Error: The stack named MatanoDPMainStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: 
     ›   Resource handler returned message: "Invalid request provided: Queue visibility timeout: 30 seconds is less than Function timeout: 60 seconds (Service: Lambda, 
     ›   Status Code: 400, Request ID: bc350930-2f6e-4f4c-9b68-809dd098f9c7)" (RequestToken: fb617621-26d6-2dd5-6739-1c4b5762885b, HandlerErrorCode: InvalidRequest)
     ›   Created temporary directory for configuration files: /tmp/mtnconfignGHkpS/config
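For reference, the constraint behind this failure is that an SQS queue's visibility timeout must be at least the consuming function's timeout before Lambda accepts the event source mapping; a preflight check could be sketched as (helper name hypothetical):

```python
def visibility_timeout_ok(queue_visibility_s: int, function_timeout_s: int) -> bool:
    """Lambda rejects an SQS event source mapping when the queue's
    visibility timeout is below the function timeout, as seen above."""
    return queue_visibility_s >= function_timeout_s

print(visibility_timeout_ok(30, 60))  # the failing combination above: False
```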
    
    opened by mdfranz 2
  • enrichment: Read CSV and translate into a vector Value Closes matanolabs/matano#27

    enrichment: Read CSV and translate into a vector Value Closes matanolabs/matano#27

    Not ready to ship yet; I have some refactoring to do, such as:

    • Think of any better ways to build the serde_json::Map without iterating over the headers for every record?
    • Other notes have been left in the code; please do comment on those if you see obvious solutions there
    • I left out the cargo.toml changes / imports when switching from the matano clone to my fork; those would've helped the build compile successfully

    I still have to test the CSV impl in general (a task for tomorrow), but JSON works as expected in its current state.


    The logic: when reading a CSV line, you get back either a StringRecord or a ByteRecord. A StringRecord doesn't convert easily back to a str/String, as it is an array of columns. A ByteRecord requires a struct to deserialize easily, something we won't know ahead of time. That left me with iterating over each header and each line of data in the CSV, creating a serde_json::Map, converting the map to a serde_json::Value, converting the serde_json::Value to a String, and then passing that String into the vector Value.
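For comparison, the same header-per-row pairing is what Python's csv.DictReader performs; a sketch of the described flow (not the Rust implementation):

```python
import csv
import io
import json

def csv_rows_to_json_strings(text: str) -> list:
    """Pair each CSV row with the header (csv.DictReader does this),
    then serialize each row map to a JSON string."""
    reader = csv.DictReader(io.StringIO(text))
    return [json.dumps(row) for row in reader]

rows = csv_rows_to_json_strings("ip,score\n1.2.3.4,90\n5.6.7.8,10\n")
print(rows[0])  # {"ip": "1.2.3.4", "score": "90"}
```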


    If you have other feedback, please do share - I'll keep refactoring until then.

    opened by kai-ten 2
  • Add option to Use VPCs in lambdas if specified by user

    Add option to Use VPCs in lambdas if specified by user

    Overview

    We currently let the user define a VPC id in their matano.config.yml like so:

    vpc:
      id: vpc-05175918865d89771
    

    However, we don't currently use the VPC in all the generated resources.

    Goal

    If the user specifies a VPC ID in their config, use the VPC when generating all resources.

    The only relevant resources currently are Lambda functions.

    Notes

    • Matano uses CDK context to cache the VPC info. You can access the VPC info inside a CDK stack like so, which will be defined if the user specified a VPC in their config:
    const vpc: cdk.aws_ec2.IVpc | undefined = (cdk.Stack.of(this) as MatanoStack).matanoVpc;
    
    • If the user doesn't specify a VPC, we can just not use any VPC for now.
    • Possibly look into using CDK aspects to simplify.
    opened by Samrose-Ahmed 2
  • Generic equivalent?

    Generic equivalent?

    This is not really an "issue"; I just want to thank you for open-sourcing this interesting project. I have been thinking along the same lines, but about a vendor-neutral alternative. Do you think there could be a vendor-neutral equivalent of this project? Something that can be deployed across cloud providers as well as bare metal (Kubernetes)? Do you think there are equivalents for the AWS components that could be replaced with CNCF and/or FOSS projects? That would be really awesome and would probably see much wider adoption, IMHO.

    It's perfectly fine though if you want to be AWS-specific. :) Happy to chat further.

    opened by dpnishant 2
  • "Error: command bootstrap not found" when bootstrapping the AWS account

    I am running Ubuntu 20.04 on Windows via WSL 1.

    I have installed node.js v12.22.12 via the Node Version Manager per https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-20-04.

    I have installed the Matano CLI and created the configuration directory.

    When I run matano bootstrap, I get the following error:

    xenophonf@l0000000d:~/src/matano/my-matano-config$ matano bootstrap
    (node:24682) SyntaxError Plugin: matano: Unexpected token '.'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
    (node:24682) SyntaxError Plugin: matano: Unexpected token '.'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
    (node:24682) SyntaxError Plugin: matano: Unexpected token '?'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
     ›   Error: command bootstrap not found
    
    opened by niheconomoum 2
  • "make install" does not install the Matano CLI independent of its source code

    I am running Ubuntu 20.04 on Windows via WSL 1.

    I have installed node.js v12.22.12 via the Node Version Manager per https://www.digitalocean.com/community/tutorials/how-to-install-node-js-on-ubuntu-20-04.

    After installing the Matano CLI per https://www.matano.dev/docs/installation, I cannot remove my copy of the source code without breaking the installation of the Matano CLI:

    xenophonf@l0000000d:~/src$ git clone https://github.com/matanolabs/matano.git
    Cloning into 'matano'...
    remote: Enumerating objects: 4324, done.
    remote: Counting objects: 100% (1119/1119), done.
    remote: Compressing objects: 100% (518/518), done.
    remote: Total 4324 (delta 618), reused 931 (delta 488), pack-reused 3205
    Receiving objects: 100% (4324/4324), 7.21 MiB | 967.00 KiB/s, done.
    Resolving deltas: 100% (2203/2203), done.
    xenophonf@l0000000d:~/src$ cd matano && make install
    cd infra && npm run clean && npm ci && npm run build
    
    > [email protected] clean /home/xenophonf/src/matano/infra
    > rm -rf dist && rm -rf node_modules
    
    added 598 packages in 14.224s
    
    > [email protected] build /home/xenophonf/src/matano/infra
    > rm -rf dist && tsc
    
    cd cli && npm run clean && npm run full-install
    
    > [email protected] clean /home/xenophonf/src/matano/cli
    > rm -rf dist && rm -rf node_modules
    
    
    > [email protected] full-install /home/xenophonf/src/matano/cli
    > npm ci && npm run build && npm uninstall -g matano && npm install -g .
    
    
    > [email protected] preinstall /home/xenophonf/src/matano/cli/node_modules/yarn
    > :; (node ./preinstall.js > /dev/null 2>&1 || true)
    
    added 632 packages in 21.89s
    
    > [email protected] build /home/xenophonf/src/matano/cli
    > rm -rf dist && tsc -b
    
    removed 1 package in 2.042s
    /home/xenophonf/.nvm/versions/node/v12.22.12/bin/matano -> /home/xenophonf/.nvm/versions/node/v12.22.12/lib/node_modules/matano/bin/run
    + [email protected]
    added 1 package from 1 contributor in 0.833s
    xenophonf@l0000000d:~/src/matano$ which matano
    /home/xenophonf/.nvm/versions/node/v12.22.12/bin/matano
    xenophonf@l0000000d:~/src/matano$ matano --help
    (node:24394) SyntaxError Plugin: matano: Unexpected token '.'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
    (node:24394) SyntaxError Plugin: matano: Unexpected token '.'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
    (node:24394) SyntaxError Plugin: matano: Unexpected token '?'
    module: @oclif/[email protected]
    task: toCached
    plugin: matano
    root: /home/xenophonf/src/matano/cli
    See more details with DEBUG=*
    █▀▄▀█ ▄▀█ ▀█▀ ▄▀█ █▄░█ █▀█
    █░▀░█ █▀█ ░█░ █▀█ █░▀█ █▄█
    
    Matano - the open source security lake platform for AWS.
    
    VERSION
      matano/0.0.0 wsl-x64 node-v12.22.12
    
    USAGE
      $ matano [COMMAND]
    
    TOPICS
      generate  Utilities to get started and generate boilerplate.
    
    COMMANDS
      autocomplete  display autocomplete installation instructions
      help          Display help for matano.
    
    xenophonf@l0000000d:~/src/matano$ cd ..
    xenophonf@l0000000d:~/src$ rm -rf matano
    xenophonf@l0000000d:~/src$ which matano
    xenophonf@l0000000d:~/src$ ls -l /home/xenophonf/.nvm/versions/node/v12.22.12/bin/matano
    lrwxrwxrwx 1 xenophonf xenophonf 34 Aug 12 11:00 /home/xenophonf/.nvm/versions/node/v12.22.12/bin/matano -> ../lib/node_modules/matano/bin/run
    xenophonf@l0000000d:~/src$ ls -l /home/xenophonf/.nvm/versions/node/v12.22.12/bin/../lib/node_modules/matano/bin/run
    ls: cannot access '/home/xenophonf/.nvm/versions/node/v12.22.12/bin/../lib/node_modules/matano/bin/run': No such file or directory
    xenophonf@l0000000d:~/src$ ls -l /home/xenophonf/.nvm/versions/node/v12.22.12/bin/../lib/node_modules/
    total 0
    lrwxrwxrwx 1 xenophonf xenophonf   32 Aug 12 11:00 matano -> ../../../../../../src/matano/cli
    drwx------ 1 xenophonf xenophonf 4096 Apr  5 03:11 npm
    
    enhancement planned 
    opened by niheconomoum 2
  • Zscaler - Managed log source

    Zscaler - Managed log source

    Add support for Zscaler logs to Matano.

    Sources

    1. Zscaler Internet Access logs (zscaler_zia)

    Tables:

    • alerts
    • dns
    • firewall
    • tunnel
    • web
    2. Zscaler Private Access logs (zscaler_zpa)

    Tables:

    • audit
    • browser_access
    • user_activity
    • user_status

    Steps

    • [ ] Implement all relevant parsers to ECS (processes from the ingest S3 bucket)
    • [ ] Build a managed poller to automatically pull logs from Zscaler
    opened by shaeqahmed 1
  • Implement deduplication for threat intel enrichment ingestion

    Implement deduplication for threat intel enrichment ingestion

    Overview

    Many threat intel sources are not static; they are modified and updated over time. If we poll for data based on time, this will introduce duplicates.

    Goal

    Add ability to deduplicate data ingested from enrichment sources.

    Notes

    Can be implemented with Athena V3 Iceberg MERGE INTO

    • For the enrichment table, have a temp table: table_temp (we need to create this table statically)
    • On new data pulled, overwrite the temp table with the new data (the puller writes to the temp table)
    • Inside the metadata writer, execute an Athena query that merges new data from the temp table into the main table. Query like:
    MERGE INTO enrichment_table main USING enrichment_table_temp new
        -- primary key
        ON (main.event.id = new.event.id)
        WHEN MATCHED
            -- all top level cols
            THEN UPDATE SET event = new.event, threat = new.threat
        WHEN NOT MATCHED
            -- all top level cols
            THEN INSERT (event, threat) VALUES (new.event, new.threat)
    
    opened by Samrose-Ahmed 0
  • matano init should create a unique resource identifier

    matano init should create a unique resource identifier

    When running multiple inits (assuming a new directory is used), I would expect a new unique identifier to be created each time

     CDKToolkit |  0/12 | 9:06:46 AM | CREATE_FAILED        | AWS::S3::Bucket         | StagingBucket cdk-hnb659fds-assets-XXXXXX-us-east-2 already exists
    
    

    I ran init twice and cdk-hnb659fds seems to be re-used. I would expect this to be unique each run, but maybe this is a constraint of CDK.

    When you have multiple repeated failures to deploy, this makes cleanup difficult. I would also expect each deployment to have unique roles.

    {
      "version": "20.0.0",
      "files": {
        "70f03c831095bf0345af1dac68037dcb2b95a9fe0c4b4d27738cfad55da1c8c7": {
          "source": {
            "path": "DPCommonStack.template.json",
            "packaging": "file"
          },
          "destinations": {
            "647303185053-us-east-2": {
              "bucketName": "cdk-hnb659fds-assets-XXXXX-us-east-2",
              "objectKey": "70f03c831095bf0345af1dac68037dcb2b95a9fe0c4b4d27738cfad55da1c8c7.json",
              "region": "us-east-2",
              "assumeRoleArn": "arn:${AWS::Partition}:iam::XXX:role/cdk-hnb659fds-file-publishing-role-XXX-us-east-2"
            }
          }
        }
      },
      "dockerImages": {}
    }
    mfranz@pixel-slate-cros:~$ cat /tmp/matanocdkoutonT9xy/DPCommonStack.assets.json
    {
      "version": "20.0.0",
      "files": {
        "70f03c831095bf0345af1dac68037dcb2b95a9fe0c4b4d27738cfad55da1c8c7": {
          "source": {
            "path": "DPCommonStack.template.json",
            "packaging": "file"
          },
          "destinations": {
            "647303185053-us-east-2": {
              "bucketName": "cdk-hnb659fds-assets-XXXX-us-east-2",
              "objectKey": "70f03c831095bf0345af1dac68037dcb2b95a9fe0c4b4d27738cfad55da1c8c7.json",
              "region": "us-east-2",
              "assumeRoleArn": "arn:${AWS::Partition}:iam::XXX:role/cdk-hnb659fds-file-publishing-role-XXX-us-east-2"
            }
          }
        }
      },
      "dockerImages": {}
    }
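    One possible workaround (a sketch, not Matano's actual behavior): CDK supports bootstrapping with a custom qualifier via `cdk bootstrap --qualifier <q>`, so deriving a distinct qualifier per project directory would avoid collisions on the shared `cdk-hnb659fds` staging bucket:

```python
import hashlib

# Sketch: derive a deterministic, per-directory bootstrap qualifier.
# CDK qualifiers are short lowercase alphanumeric strings (the default
# "hnb659fds" is 9 characters); a hex digest prefix satisfies that.
def unique_qualifier(project_dir):
    return hashlib.sha256(project_dir.encode()).hexdigest()[:9]
```

    The qualifier would then also need to be passed to the stacks (e.g. via the `@aws-cdk/core:bootstrapQualifier` context) so synthesized assets reference the right bucket and roles.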
    opened by mdfranz 3
  • Make it easier to test VRL transformations + schema changes

    Make it easier to test VRL transformations + schema changes

    Overview

    It is currently difficult to test VRL and schema changes in Matano. It requires a deployment and results in errors that make it hard to ascertain the issue.

    Goal

    Add functionality to be able to test changes to VRL transformations and schemas easily while developing.
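    Since VRL executes inside the pipeline rather than in Python, a local test harness can only be sketched here; the VRL program below is replaced by a stand-in Python transform to show the shape such a test could take (assert the normalized ECS output before deploying):

```python
# Stand-in for a VRL program (illustrative only; real VRL runs in Matano):
# map raw Okta geo fields to ECS client.geo.location.
def transform(raw):
    return {
        "client": {"geo": {"location": {
            "lat": float(raw["latitude"]),
            "lon": float(raw["longitude"]),
        }}}
    }

# A local test: feed a sample raw event through the transform and assert
# the output matches the expected schema.
event = {"latitude": "37.7749", "longitude": "-122.4194"}
out = transform(event)
assert out["client"]["geo"]["location"] == {"lat": 37.7749, "lon": -122.4194}
```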

    opened by Samrose-Ahmed 0
  • Add CLI command to bulk search for IoCs across the data lake

    Add CLI command to bulk search for IoCs across the data lake

    Overview

    Currently, it is difficult to search for a known indicator across all/multiple tables in your Matano security lake.

    Goals

    Add a CLI command that automatically searches for a given indicator against all relevant fields in all relevant tables.

    For example, one can provide a malicious IP and it will be searched across columns such as related.ip in all Matano tables that have this field.

    Notes

    • Display a table showing an aggregate view of matches in each table
    • Support saving matches to a file
    • Support searching any ECS field and narrowing matches by time
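    A minimal sketch of what such a command could run under the hood (hypothetical helper, not the actual CLI; assumes each table exposes ECS's `related.ip` array and a `ts` timestamp column):

```python
# Hypothetical sketch: build one Athena query per table that checks an
# indicator against related.ip within a time window, returning match counts.
def build_ioc_queries(indicator, tables, start, end):
    return [
        f"SELECT '{t}' AS matano_table, count(*) AS matches FROM {t} "
        f"WHERE contains(related.ip, '{indicator}') "
        f"AND ts BETWEEN timestamp '{start}' AND timestamp '{end}'"
        for t in tables
    ]

for q in build_ioc_queries("203.0.113.7", ["aws_cloudtrail", "okta"],
                           "2022-11-01 00:00:00", "2022-11-02 00:00:00"):
    print(q)
```

    A real implementation would discover which tables actually contain the relevant field from the schema rather than taking a hardcoded list.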
    enhancement 
    opened by Samrose-Ahmed 0
Releases (nightly)