flink-pi

Simple Scala examples for using Apache Flink's DataStream and SQL API for the same problem, that is simple enough not to involve other libraries.

The example approximates Pi with the Monte Carlo method using Flink's

DataStream API : ds.PiEstimator
SQL API (Blink) : sql.PiEstimator
DataStream and SQL API : mixed.PiEstimator

All three examples have the same functionality, by iteratively approaching better and better estimate for the value of Pi. All write the actual best estimate in the given intervals (3 sec)
till an explicit user interruption.

How it Works

As a continuous DataSource, (random/sequnetial) IDs are generated either by IdGenerator or by DataGen SQL Connector.

The next step is the creation is random point in a 2x2 box , centered in the origin, for each generated ID. The ratio of withinCircle points in the sample estimates Pi/4. Rationale : The area of the 2x2 box is 4, the area of the 1 unit radius circle inside is 1*1*Pi = Pi by definition. As a consequence, the ratio of these two areas are : Pi/4 = P(isWithinCircle)/1=> Pi = 4 * P(isWithinCircle) The data from the millions of trials (resulting 4.0's and 0.0's) are aggregated in two phases. The first phase is on each thread (the bulk of the workload), while the second is a global aggregation.

The output is collected in Print SQL Sink or Flink's DataStreamSink.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

flink-pi

How it Works

Files

README.md

Latest commit

History

README.md

File metadata and controls

flink-pi

How it Works