Utility Implementation Meta Issue #3

Open
9 of 13 tasks
fschueler opened this issue Mar 2, 2016 · 3 comments

Comments

@fschueler

  • DataSetConverterUtil
    • CSVToBinaryBlock
    • BinaryBlockToCSV
    • TextCellToBinaryBlock
    • BinaryCellToBinaryBlock
    • stringtoSerializableText
  • DataSetAggregateUtils
    • sumStable
    • mergeByKey
    • aggStable
    • aggByKeyStable
  • DataSetUtils
    • computeNNZFromBlocks
@FelixNeutatz

I implemented a first working version for TextCellToBinaryBlock. But there is a problem:

They use a flatMap for TextToBinaryBlockFunction. There they create a buffer and only emit records when the buffer is full. So I would need something like this:

private static class TextToBinaryBlockFunction extends RichFlatMapFunction<String,Tuple2<MatrixIndexes,MatrixBlock>>
{
    ReblockBuffer rbuff = null;
    Collector<Tuple2<MatrixIndexes,MatrixBlock>> outHandle = null;

    @Override
    public void open(Configuration parameters) {
        rbuff = new ReblockBuffer();
    }

    @Override
    public void flatMap(String text, Collector<Tuple2<MatrixIndexes,MatrixBlock>> out) 
    {
        outHandle = out;

        //flush buffer if necessary
        if (rbuff.getSize() >= rbuff.getCapacity()) {
            flushBufferToList(out, rbuff);
        }

        //add value to reblock buffer
        rbuff.appendCell(text);         
    }

    @Override
    public void close() throws Exception {
        //final flush buffer
        flushBufferToList(outHandle, rbuff);
    }
}

But the problem is that it does not seem to be possible to emit records in the close() method. Does anybody have an idea? Currently I just emit every time, which at least works ...

@FelixNeutatz

MapPartition solves the problem :)
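
For anyone hitting the same issue, here is a rough sketch of why a mapPartition-style function avoids the close() problem: the function sees the whole partition in one call, so the final flush can happen while the collector is still in scope. This is not the actual ReblockFLInstruction code. To keep it runnable without Flink on the classpath, `RecordCollector` is a hypothetical stand-in for Flink's `Collector`, a plain `List<String>` stands in for `ReblockBuffer`, and the `capacity` parameter and `run` helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for Flink's Collector, so the sketch runs without Flink. */
interface RecordCollector<T> {
    void collect(T record);
}

class BufferedPartitionSketch {

    /**
     * Mirrors the shape of a mapPartition call: invoked once per partition
     * with all of its records, so no state has to survive until close().
     * The List buffer is an assumption standing in for ReblockBuffer.
     */
    static void mapPartition(Iterable<String> cells,
                             RecordCollector<List<String>> out,
                             int capacity) {
        List<String> buffer = new ArrayList<>(capacity);
        for (String cell : cells) {
            // flush buffer if necessary
            if (buffer.size() >= capacity) {
                out.collect(new ArrayList<>(buffer));
                buffer.clear();
            }
            // add value to buffer
            buffer.add(cell);
        }
        // final flush: the collector is still valid here, unlike in close()
        if (!buffer.isEmpty()) {
            out.collect(new ArrayList<>(buffer));
        }
    }

    /** Illustrative helper: runs one "partition" and returns what was emitted. */
    static List<List<String>> run(List<String> cells, int capacity) {
        List<List<String>> emitted = new ArrayList<>();
        mapPartition(cells, emitted::add, capacity);
        return emitted;
    }

    public static void main(String[] args) {
        // Three text cells with a buffer capacity of two
        System.out.println(run(Arrays.asList("1 1 0.5", "1 2 1.5", "2 1 2.5"), 2));
    }
}
```

In the real job the loop body would presumably stay as in the flatMap version above; only the surrounding function type changes, so that the last partial buffer is emitted at the end of the partition instead of in close().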

@FelixNeutatz

I finished ReblockFLInstruction overall

2 participants