Commit

viirya committed May 4, 2019
1 parent d7312fb commit 04a2e04
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions python/pyspark/sql/tests/test_serde.py
@@ -128,6 +128,10 @@ def test_BinaryType_serialization(self):

    def test_int_array_serialization(self):
        # Note that this test seems dependent on parallelism.
        # The issue arises because Pyrolite does not clear its internal object map after the
        # STOP op code. When Python objects are pickled with protocol 4, the MEMOIZE op code
        # stores objects in that map, so it must be cleared to make sure the next unpickling
        # starts from a clean map.
        data = self.spark.sparkContext.parallelize([[1, 2, 3, 4]] * 100, numSlices=12)
        df = self.spark.createDataFrame(data, "array<integer>")
        self.assertEqual(len(list(filter(lambda r: None in r.value, df.collect()))), 0)
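The comment in this hunk refers to the MEMOIZE op code that pickle protocol 4 introduces. As a minimal sketch, not part of the commit, the following uses only the standard library to list the op codes emitted for the same kind of list the test pickles. The helper name `opcode_names` is hypothetical and exists only for illustration.

```python
import pickle
import pickletools


def opcode_names(obj, protocol):
    """Return the names of the pickle op codes used to serialize obj.

    Hypothetical helper for illustration; it is not part of PySpark or Pyrolite.
    """
    payload = pickle.dumps(obj, protocol=protocol)
    # pickletools.genops yields (opcode, argument, position) triples.
    return [op.name for op, _arg, _pos in pickletools.genops(payload)]


data = [1, 2, 3, 4]
ops_v2 = opcode_names(data, 2)
ops_v4 = opcode_names(data, 4)

# Protocol 2 records the list in the memo with an explicit index (BINPUT),
# while protocol 4 uses MEMOIZE, which takes the next free memo slot implicitly.
print("protocol 2:", ops_v2)
print("protocol 4:", ops_v4)
```

Because MEMOIZE carries no explicit index, an unpickler has to derive the slot from the current state of its memo. That is consistent with the comment's point that a memo left over from a previous pickle stream needs to be cleared once STOP is reached.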