Add real-time analytics with Apache Storm project #991

Open
wants to merge 1 commit into
base: master
29 changes: 29 additions & 0 deletions sorrentum_sandbox/spring2024/Dockerfile
@@ -0,0 +1,29 @@
FROM python:3.9

# Install Redis
RUN apt-get update && apt-get install -y redis-server

# Install redis-py
RUN pip install redis

# Expose the Redis port
EXPOSE 6379

# Set working directory
WORKDIR /app

# Copy Python script (assuming your Python script is named app.py)
COPY app.py .

# Start Redis server and run the Python script
CMD ["bash", "-c", "service redis-server start && python app.py"]
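Review note: `service redis-server start` may return before Redis is actually accepting connections, so `app.py` can race the server on startup. The `app.py` script is not part of the rendered diff, so the following is only a minimal sketch of a readiness check it could run first. The helper name `wait_for_port` and its defaults are hypothetical; the port 6379 comes from the `EXPOSE` line above.

```python
import socket
import time


def wait_for_port(
    host: str, port: int, timeout: float = 10.0, interval: float = 0.2
) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Return False if no connection could be made before `timeout` seconds
    have elapsed. Hypothetical helper, not part of this PR.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A script could call `wait_for_port("127.0.0.1", 6379)` before creating its redis-py client and exit with an error if it returns `False`. This only uses the standard library, so it adds nothing to the image.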

706 changes: 706 additions & 0 deletions sorrentum_sandbox/spring2024/LICENSE

Large diffs are not rendered by default.

62 changes: 62 additions & 0 deletions sorrentum_sandbox/spring2024/README.md.old
@@ -0,0 +1,62 @@


<!-- toc -->

- [Hello! Nice to meet you](#hello-nice-to-meet-you)
- [Commitment to contribute](#commitment-to-contribute)

<!-- tocstop -->

<!-- <img width="100" alt="image" src="https://user-images.githubusercontent.com/33238329/216777823-851b28ed-7d7a-4b52-9d71-ab38d146edc3.png"> -->

# Hello! Nice to meet you

We are very happy that you are interested in KaizenFlow.

KaizenFlow is an open-source project to build:

- Machine learning and AI geared towards finance and economics
- Web3 / DeFi protocol

The project aims to combine open-source development, startups, and brilliant
students. We’ve seen this mixture of ingredients work exceptionally well at
Stanford / Berkeley / MIT / etc., where every student seems to be trying to
start a company on the side.

Our goal is to bootstrap the same virtuous cycle outside Silicon Valley so that
instead of just looking for a job, you create your own. We are still figuring
things out as we go, and we are working with the University of Maryland and
other interested parties to provide internships, research assistantships, and
development grants.

Besides the immediate financial benefit, this is a unique opportunity for you
to:

- Work on cutting-edge problems in AI, machine learning, and Web3
- Learn about startups and how to start your own project
- Write academic papers
- Get internships and full-time positions at companies working on KaizenFlow
  applications or from our network

Most importantly, this is a unique way to be part of a community of individuals
interested in building innovative products.

# Commitment to contribute

This is our only request to you.

We understand that due to your commitments (e.g., classes, life), you might not
be able to work on KaizenFlow consistently. That’s ok. At the same time, please
be aware that taking on a task means that:

1. The same task might not be available to your colleagues; and

2. We spend time helping, training, and mentoring you. So the energy we put into
   helping you will be taken away from your colleagues. If you drop out of the
   project, our effort could have been used for other teammates who committed
   more firmly to making progress.

In other words, if you are not sure you can commit a meaningful amount of time
to KaizenFlow (e.g., 20 hours / week), it is wise to wait to be sure you can do
it. If you are excited and want to start, go for it, do your best, and we’ll
make this experience the best possible for you.
Empty file.
141 changes: 141 additions & 0 deletions sorrentum_sandbox/spring2024/changelog.txt
@@ -0,0 +1,141 @@
# cmamp-1.14.0
- 2024-03-06
- Promote `pyarrow` to 15.0.0
- Changes from:
  - #7292 - DEV_TOOLS - Docker - Update pyarrow to latest version

# cmamp-1.13.0
- 2024-02-26
- Install AWS CLI for Linux and Mac architectures
- Changes from:
  - #7280 - aws CLI doesn't work in kaizenflow dev_tools ARM container

# cmamp-1.12.0
- 2024-02-20
- Promote `pyarrow` to 14.0.2
- Changes from:
  - #7097 - Update pyarrow package

# cmamp-1.11.0
- 2024-01-16
- Promote CCXT to 4.2.13
- Upgrade AWS CLI to V2
- Changes from:
  - #526 - Update to AWS CLI v2
  - #6756 - Adapt recommendations given by ccxt for bid ask

# cmamp-1.10.0
- 2023-11-13
- Promote Pandas to 2.1.1
- Promote Numpy to 1.26.0
- Changes from #4662 - Promote pandas version

# cmamp-1.9.0
- 2023-09-29
- Promote Python to 3.9
- Changes from #5439 - Update Python to 3.9 (or even to the latest) version

# cmamp-1.8.0
- 2023-09-27
- Remove `mxnet` and `gluonts`
- Changes from #5466 - Remove mxnet, gluonts and disable related tests

# cmamp-1.7.0
- 2023-09-05
- Bump `pytest` to the latest version
- Changes from #5119 - Pytest teardown error

# cmamp-1.6.0
- 2023-08-14
- Add `gspread_pandas`
- Update CCXT

# cmamp-1.5.0
- 2023-08-03
- Multi-architecture Docker image
- Add user_1000
- CmTask4886

# cmamp-1.4.3
- 2023-02-24
- Changes from #3585 IM - Extract v3.0 (Next gen Data QA)
- Changes from #3662 IM - Extract v3.2 (Extend the dataset matrix)

# cmamp-1.4.2
- 2023-01-11
- Changes from #3503 IM - Extract v2.9 (Prod Database, Continue Adapting to kaizenflow)

# cmamp-1.4.1
- 2022-12-19
- Changes from #3165
- Changes from #3341

# cmamp-1.4.0
- 2022-11-15
- Fix Jupytext issue

# cmamp-1.3.0
- 2022-11-14
- Add `cvxopt` and `cvxpy`

# cmamp-1.2.1
- 2022-11-03
- Changes from CmTask#2683

# cmamp-1.2.0
- 2022-10-04
- Update `ccxt`

# cmamp-1.1.1
- 2022-09-02
- Updates to production DAGs (All DAGs now use /amp as submodule)
- Add daily data reconciliation DAG

# cmamp-1.1.0
- 2022-04-18
- Generate `docker-compose` programmatically instead of composing different files

# cmamp-1.0.9
- 2022-03-15
- Update `docker-compose` version

# cmamp-1.0.8
- 2022-03-09
- Add more docker users

# cmamp-1.0.7
- 2022-02-16
- Remove useless Python packages
- Update `pandas`
- See CMTask #1028 for details

# cmamp-1.0.6
- 2022-01-13
- Add more docker users

# cmamp-1.0.5
- 2022-01-05
- Add `importlib-resources` Python package
- Unfreeze the `jsonschema` version

# cmamp-1.0.4
- 2021-12-30
- Add `moto` Python package

# cmamp-1.0.3
- 2021-12-11
- Add `python-dotenv` Python package

# cmamp-1.0.2
- 2021-12-07
- Add `pytest-rerunfailures` Python package

# cmamp-1.0.1
- 2021-12-01
- Add `pytest-timeout` Python package
- Freeze the `awscli` version = `1.22.17`, see #339
- Freeze the `jsonschema` version = `3.2.0`, see #650

# cmamp-1.0.0
- 2021-10-26
- Initial version
145 changes: 145 additions & 0 deletions sorrentum_sandbox/spring2024/conftest.py
@@ -0,0 +1,145 @@
import logging
import os
from typing import Any, Generator

import helpers.hdbg as dbg
import helpers.hunit_test as hut

# Hack to work around pytest not being happy with multiple redundant
# conftest.py (bug #34).
if not hasattr(hut, "_CONFTEST_ALREADY_PARSED"):

    # import helpers.hversion as hversi
    # hversi.check_version()

    # pylint: disable=protected-access
    hut._CONFTEST_ALREADY_PARSED = True

    # Store whether we are running unit test through pytest.
    # pylint: disable=line-too-long
    # From https://docs.pytest.org/en/latest/example/simple.html#detect-if-running-from-within-a-pytest-run
    def pytest_configure(config: Any) -> None:
        _ = config
        # pylint: disable=protected-access
        hut._CONFTEST_IN_PYTEST = True

    def pytest_unconfigure(config: Any) -> None:
        _ = config
        # pylint: disable=protected-access
        hut._CONFTEST_IN_PYTEST = False

    # Create a variable to store the object used by pytest to print independently
    # of the capture mode.
    # https://stackoverflow.com/questions/41794888
    import pytest

    @pytest.fixture(autouse=True)
    def populate_globals(capsys):
        hut._GLOBAL_CAPSYS = capsys

    # Add custom options.
    def pytest_addoption(parser: Any) -> None:
        parser.addoption(
            "--update_outcomes",
            action="store_true",
            default=False,
            help="Update golden outcomes of test",
        )
        parser.addoption(
            "--incremental",
            action="store_true",
            default=False,
            help="Reuse and not clean up test artifacts",
        )
        parser.addoption(
            "--dbg_verbosity",
            dest="log_level",
            choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
            help="Set the logging level",
        )
        parser.addoption(
            "--dbg",
            action="store_true",
            help="Set the logging level to TRACE",
        )
        parser.addoption(
            "--image_version",
            action="store",
            help="Version of the image to test against",
        )
        parser.addoption(
            "--image_stage",
            action="store",
            help="Stage of the image to test against",
        )

    def pytest_collection_modifyitems(config: Any, items: Any) -> None:
        _ = items
        import helpers.henv as henv

        _WARNING = "\033[33mWARNING\033[0m"
        try:
            print(henv.get_system_signature()[0])
        except Exception:  # pylint: disable=broad-except
            # Printing the signature is best-effort: never fail collection.
            print(f"\n{_WARNING}: Can't print system_signature")
        if config.getoption("--update_outcomes"):
            print(f"\n{_WARNING}: Updating test outcomes")
            hut.set_update_tests(True)
        if config.getoption("--incremental"):
            print(f"\n{_WARNING}: Using incremental test mode")
            hut.set_incremental_tests(True)
        # Set the verbosity level.
        level = logging.INFO
        if config.getoption("--dbg_verbosity", None) or config.getoption(
            "--dbg", None
        ):
            if config.getoption("--dbg_verbosity", None):
                level = config.getoption("--dbg_verbosity")
            elif config.getoption("--dbg", None):
                level = logging.TRACE
            else:
                raise ValueError("Can't get here")
            print(f"\n{_WARNING}: Setting verbosity level to {level}")
            # When we specify the debug verbosity we monkey patch the command
            # line to add the '-s' option to pytest to not suppress the output.
            # NOTE: monkey patching sys.argv is often fragile.
            import sys

            sys.argv.append("-s")
            sys.argv.append("-o log_cli=true")
        # TODO(gp): redirect also the stderr to file.
        dbg.init_logger(level, in_pytest=True, log_filename="tmp.pytest.log")

    if "PYANNOTATE" in os.environ:
        print("\nWARNING: Collecting information about types through pyannotate")
        # From https://github.com/dropbox/pyannotate/blob/master/example/example_conftest.py
        import pytest

        def pytest_collection_finish(session: Any) -> None:
            """
            Handle the pytest collection finish hook: configure pyannotate.

            Explicitly delay importing `collect_types` until all tests
            have been collected. This gives gevent a chance to monkey
            patch the world before importing pyannotate.
            """
            # mypy: Cannot find module named 'pyannotate_runtime'
            # Import the submodule explicitly: a bare `import pyannotate_runtime`
            # is not guaranteed to load `collect_types`.
            from pyannotate_runtime import collect_types  # type: ignore

            _ = session
            collect_types.init_types_collection()

        @pytest.fixture(autouse=True)
        def collect_types_fixture() -> Generator:
            from pyannotate_runtime import collect_types  # type: ignore

            collect_types.start()
            yield
            collect_types.stop()

        def pytest_sessionfinish(session: Any, exitstatus: Any) -> None:
            from pyannotate_runtime import collect_types  # type: ignore

            _ = session, exitstatus
            collect_types.dump_stats("type_info.json")
            print("\n*** Collected types ***")
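Review note: the verbosity handling in `pytest_collection_modifyitems` is easier to follow, and to unit test, when the precedence between `--dbg_verbosity` and `--dbg` is pulled out of the hook. The sketch below only mirrors that precedence under stated assumptions. The function name `resolve_log_level` is hypothetical, and `TRACE = 5` is an illustrative stand-in for the custom level that `conftest.py` reads as `logging.TRACE`, which is not defined by the standard library. Unlike the hook, which passes the option string through unchanged, the sketch returns numeric levels.

```python
import logging
from typing import Optional

# Assumption: the project's logging helper registers a custom TRACE level
# below DEBUG. The value 5 is illustrative only.
TRACE = 5


def resolve_log_level(dbg_verbosity: Optional[str], dbg: bool) -> int:
    """Return the numeric log level implied by the two debug options.

    `--dbg_verbosity` takes precedence over `--dbg`; with neither set the
    level defaults to INFO, as in the hook.
    """
    if dbg_verbosity:
        # The option's `choices` are all standard level names, so each one
        # is an integer attribute of the `logging` module.
        return getattr(logging, dbg_verbosity)
    if dbg:
        return TRACE
    return logging.INFO
```

For example, `resolve_log_level("DEBUG", True)` yields `logging.DEBUG` because the explicit verbosity wins over `--dbg`.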
Empty file.