This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

GHOSTS Animator is a library and API for generating realistic NPCs for training and exercise.

cmu-sei/GHOSTS-ANIMATOR

GHOSTS Animator

A configurable and extensible library for generating modeling, simulation and exercise data. As we say, "NPCs so real, they sell for a premium on the dark web."

At its core, Animator is a hyper-realistic user details generator. Its primary function is to create fake identities and accompanying verbose portfolios of personal information. Each generated user, or NPC (Non-Player Character) as we call them, has over 25 categories of details associated with them, and over a hundred pieces of metadata defining who they are. Each piece of information is generated using sourced datasets in an attempt to distribute characteristics realistically.

Image of NPC Profile

Quick Start

Documentation for Animator is here with easy access to the rest of the GHOSTS framework as well.

  1. git clone https://github.com/cmu-sei/GHOSTS-ANIMATOR
  2. cd GHOSTS-ANIMATOR/src
  3. docker build . -t ghosts/animator
  4. docker compose up -d
  5. Browse to http://localhost:5000/swagger
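
Containers can take a few seconds to start, so the Swagger page in step 5 may not respond immediately. The following is a minimal Python sketch of a readiness check, not part of Animator itself. The helper name wait_until_ready and its timeout and interval defaults are assumptions chosen for illustration; the only detail taken from the steps above is the URL.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_ready(url, timeout=60.0, interval=2.0):
    """Poll `url` until it answers with an HTTP 2xx status.

    Returns True as soon as a 2xx response arrives, or False once
    `timeout` seconds have passed without one.
    (Hypothetical helper; not shipped with Animator.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # urlopen follows redirects, so a redirect to the Swagger UI
            # index page still counts as success.
            with urllib.request.urlopen(url, timeout=interval) as response:
                if 200 <= response.status < 300:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused, timeout, or an HTTP error status:
            # the service is probably still starting, so keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_until_ready("http://localhost:5000/swagger") could be called after step 4 and before opening the browser.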

Why Use Animator

The data generated by Animator is useful in many settings, and particularly in four areas:

Training Machine Learning Algorithms

Animator creates large sets of hyper-realistic user data that can be used to train machine learning algorithms. This enables the rapid training of anthropology-related ML algorithms that draw on one or more of the hundred-plus data points generated by Animator.

Honeypot Payloads

NPC details generated by Animator are designed to be as realistic as possible given the available relevant open-source information. This makes the user data convincingly real while still being completely fabricated. Therefore, the data is ideal for use in applications like honeypots, where the goal is to trick an attacker into thinking they are compromising an asset with real user data. This data is also perfect for any other application that would benefit from extremely realistic user information.

Insider Threat Modeling

Each Animator NPC is given an Insider Threat Profile. This profile determines how likely it is that the NPC is an insider threat by incorporating the CDSE's Insider Threat Potential Indicators. As we continue developing Animator, it will be possible to configure NPCs to be more or less likely to be insider threats based on factors such as their finances, criminal history, foreign contacts, and mental health.

Social Network and Relationship Modeling

Animator can establish relationships between the NPCs it generates. As we increase the fidelity of inter-NPC relationships, Animator NPCs will form larger and more realistic social networks. Because Animator can quickly generate thousands of inter-related NPCs, it is well suited to social network modeling and research.

How it Works

  1. The Animator API is built on .NET Core. Once running, it stands up a server and RESTful API that can be used to create NPCs.
  2. Once Animator receives a request to create NPCs, it starts by creating an empty NPC Profile.
  3. Animator then iterates through all 100+ data points for the NPC and generates synthetic data to be associated with that NPC.
    • Example data points are name, address, mental health, career, finances, and family members.
  4. Data points are generated either purely at random or by weighted randomization. Weighted randomization uses verified datasets to shape the distribution of generated values so that it matches reality more closely.
    • Note that our primary goal in Animator is to be as realistic with our data as possible. As we develop Animator, we aim to use weighted randomization for as many data points as we can find datasets to support.
  5. Animator repeats this process for as many NPCs as the request specifies. The resulting data can be exported through the API or stored in a local database.
    • Animator currently supports storing NPC data in a local MongoDB database. This feature is still being actively improved.
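
The generation loop described in the steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not Animator's implementation (which is written in .NET). The names weighted_pick, generate_npc, and generate_npcs, the two example data points, and the weights are all hypothetical and do not come from Animator's datasets.

```python
import random

# Hypothetical values for illustration only; these are NOT Animator's datasets.
CAREER_WEIGHTS = {"analyst": 0.5, "engineer": 0.3, "manager": 0.2}
FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley"]


def weighted_pick(weights, rng):
    """Return one key of `weights`, chosen with probability proportional to its value."""
    options = list(weights)
    return rng.choices(options, weights=[weights[o] for o in options], k=1)[0]


def generate_npc(rng):
    """Start from an empty profile and fill in each data point in turn."""
    profile = {}
    # Plain randomization: every option is equally likely.
    profile["first_name"] = rng.choice(FIRST_NAMES)
    # Weighted randomization: options follow the supplied distribution.
    profile["career"] = weighted_pick(CAREER_WEIGHTS, rng)
    return profile


def generate_npcs(count, seed=None):
    """Generate `count` profiles; passing a seed makes the output repeatable."""
    rng = random.Random(seed)
    return [generate_npc(rng) for _ in range(count)]
```

Calling generate_npcs(1000) returns a list of 1000 profile dictionaries in which roughly half of the careers are "analyst", because that option carries half of the total weight in this made-up table. A real dataset would replace the hard-coded weights.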

Image of Animator Relationship Network

Sources

Below are the sources we've used in Animator to date:


GHOSTS Animator began as a port of Faker.Net, which is itself a C# port of faker for Ruby. Similar libraries exist for Python and other programming languages.


[DISTRIBUTION STATEMENT A] This material has been approved for public release and unlimited distribution.

Copyright 2020 Carnegie Mellon University. All Rights Reserved. See LICENSE.md file for terms.