python/pyspark/sql/tests.py: 6 additions & 0 deletions
@@ -1435,6 +1435,12 @@ def test_time_with_timezone(self):
         self.assertEqual(now, now1)
         self.assertEqual(now, utcnow1)

+    # regression test for SPARK-19561
+    def test_datetime_at_epoch(self):
+        epoch = datetime.datetime.fromtimestamp(0)
+        df = self.spark.createDataFrame([Row(date=epoch)])
+        self.assertEqual(df.first()['date'], epoch)
Member:
So, before this patch, df.first() is Row(None) in this case?

Member:

Can we make a test case in class DataTypeTests(unittest.TestCase) instead?

Author:

Yes, before this patch, df.first() is Row(None).

I tried putting it in DataTypeTests first, but it was difficult to get a reasonable failing test case there. Python ints go up to 2^63 on 64-bit systems, so the value doesn't overflow into a long on the Python side. The issue is that Scala's Int is 32-bit, so Py4J is the part that converts it to a long.

We could put the test there, but it doesn't really capture the issue IMO.
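
(For anyone skimming the thread: a minimal sketch of the type-level difference, assuming Python 2 semantics, where int and long are distinct types and Py4J maps them to Java int and Java long respectively.)

    import datetime
    from pyspark.sql.types import TimestampType

    epoch = datetime.datetime.fromtimestamp(0)
    internal = TimestampType().toInternal(epoch)

    # Before the patch the return value is int(seconds) * 1000000 + dt.microsecond,
    # which at the epoch stays a small plain Python int; per the explanation above,
    # Py4J then hands the JVM a 32-bit Java int where a long is expected for
    # TimestampType, which is why df.first() came back as Row(None).
    print(type(internal))   # after the patch: <type 'long'>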


     def test_decimal(self):
         from decimal import Decimal
         schema = StructType([StructField("decimal", DecimalType(10, 5))])
python/pyspark/sql/types.py: 1 addition & 1 deletion
@@ -189,7 +189,7 @@ def toInternal(self, dt):
         if dt is not None:
             seconds = (calendar.timegm(dt.utctimetuple()) if dt.tzinfo
                        else time.mktime(dt.timetuple()))
-            return int(seconds) * 1000000 + dt.microsecond
+            return long(seconds) * 1000000 + dt.microsecond
Member:
Yep. It looks to me like every review comment has been applied.


     def fromInternal(self, ts):
         if ts is not None:
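
(For what it's worth, the pure-Python round trip through toInternal and fromInternal succeeds both before and after this change; the failure only shows up once the internal value crosses Py4J into the JVM, which is why the regression test above goes through createDataFrame. A quick sketch, assuming Python 2 and a local PySpark install:)

    import datetime
    from pyspark.sql.types import TimestampType

    ts_type = TimestampType()
    epoch = datetime.datetime.fromtimestamp(0)

    internal = ts_type.toInternal(epoch)             # a Python long after this patch
    assert ts_type.fromInternal(internal) == epoch   # holds before and after the patch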