Skip to content

Conversation

@gdarmont
Copy link
Contributor

Hi mybatis dev team,

Here's an up to date version of a minimalist Cursor implementation. This PR replaces the previous implementation I made in PR 52 (#52).
I bumped the version to 3.4.0-SNAPSHOT as SqlSession gets new methods.

This PR adds the Cursor feature which is needed to implement a MyBatisCursorItemReader for Spring Batch.

Features

  • Added new SqlSession methods selectCursor()
  • Cursors extends Iterable to be easily used with foreach Statement and implements Closeable to be used with try-with-resources in Java 7+
  • Cursors are automatically closed when fully fetched or on SqlSession close

A Cursor does not retain any fetched element, so it is a perfect choice for iterating over a ResultSet with several millions rows, without exhausting memory.

Usage :

XML mapping :

    <select id="getEmployeeNestedCursor" resultMap="results" resultOrdered="true">
       select id,name,salary,skill from employees order by id
    </select>

Java code :

// Get a list which can have millions of rows
Cursor<Employee> employees = sqlSession.selectCursor("getEmployeeNestedCursor");

// Get an iterator on employees
Iterator<Employee> iter = employees.iterator();

List<Employee> smallChunk = new ArrayList(10);
while (iter.hasNext()) {
    // Fetch next 10 employees
    for(int i = 0; i<10 && iter.hasNext(); i++) {
        smallChunk.add(iter.next());
    }
    doSomethingWithAlreadyFetchedEmployees(smallChunk);
    smallChunk.clear();
}

Best regards,
Guillaume

gdarmont added 2 commits July 10, 2015 00:09
This change is due to the introduction of new methods for Cursor handling in SqlSession interface
* Added new SqlSession methods selectCursor()
* Cursors extends Iterable to be easily used with foreach Statement and implements Closeable to be used with try-with-resources in Java 7+
* Cursors are automatically closed when fully fetched or on SqlSession close

  This feature is useful to implement functionality such as the "MyBatisCursorItemReader" in MyBatis Spring.
emacarron added a commit that referenced this pull request Jul 11, 2015
@emacarron emacarron merged commit 33c1e9f into mybatis:master Jul 11, 2015
@emacarron
Copy link
Member

Its been a long time @gdarmont but thank you very much for looking again into this.

@emacarron emacarron added the enhancement Improve a feature or add a new feature label Jul 11, 2015
@emacarron emacarron added this to the 3.4.0 milestone Jul 11, 2015
@emacarron emacarron self-assigned this Jul 11, 2015
kazuki43zoo added a commit to kazuki43zoo/mybatis-3 that referenced this pull request Jul 12, 2015
* modify to use Cursor.class.isAssignableFrom(Class<?>)
* modfiy to use ExceptionFactory.wrapException(String,Exception)
* modify copyright year
emacarron added a commit that referenced this pull request Jul 12, 2015
kazuki43zoo added a commit to kazuki43zoo/mybatis-3 that referenced this pull request Jul 12, 2015
@kazuki43zoo
Copy link
Member

Hi @emacarron @gdarmont,

https://github.com/mybatis/mybatis-3/blob/master/src/main/java/org/apache/ibatis/cursor/defaults/DefaultCursor.java#L102-L104

    protected T fetchNextUsingRowBound() {
        T result = fetchNextObjectFromDatabase();
        while (currentIndex < rowBounds.getOffset()) {
            result = fetchNextObjectFromDatabase();
        }
        return result;
    }

Logic to skip records until offset is this ?
I think this logic(way to skip) should be discuss to improve.
How do everyone think ?

@alexey-su
Copy link
Contributor

What happens when you call again?

Iterator<?> it = cursor.iterator();

cursor.close(); // first call
cursor.close(); // ??? second call

it.hasNext(); // ??? after cursor close()

Nose smell that fetchNextObjectFromDatabase method does not check the value of the attribute "opened".

I stand by @kazuki43zoo.
Set RowBounds.offset = 1000
And actual 10 rows in table.
Your method will go into an infinite loop.

Your code is not optimal. Before the walk up to the first RowBounds.offset, you raise and lose a lot of objects. You do not take into account that could go ResultMap calls nested SQL queries.
For example:

<association property="type" column="TYPE_ID" jdbcType="VARCHAR"
  javaType="DocumentTypeEntity" select="selectType"/>

@gdarmont
Copy link
Contributor Author

Hi all,

Thanks for the feedback. I admit I did not make tests to handle corner cases, and as you mentioned there may be problems (several close() calls, not enough rows etc...).
This part can easily be fixed. I will push fixes and corresponding tests ASAP.

I think it is worth explaining the origin of Cursor :
The need of a MyBatis Cursor comes from the need of having a SpringBatch ItemReader that can handle several millions items with only one request, without exhausting memory. Currently the MyBatisPagingItemReader make one request per chunk (this can be very time consuming for requests on large datasets) and can get wrong results if paging is done on request with nested resultmaps (see tests in https://github.com/gdarmont/mybatis-spring/blob/943797d2fd036efa293fdb65a8b76794d5c2105d/src/test/java/org/mybatis/spring/batch/SpringBatchTest.java)

Requirement : SQL requests used with selectCursor() needs to be ordered (resultOrdered="true" in mapping), to allow intermediate result objects to be released (https://github.com/mybatis/mybatis-3/blob/master/src/main/java/org/apache/ibatis/executor/resultset/DefaultResultSetHandler.java#L771-L773), otherwise it wouldn't be very memory efficient for very large resultset. Result ordering is the only constraint that makes selectCursor() different from selectList().

RowBound part : The only reason I integrated a selectCursor() method taking a RowBound is to have a one to one Cursor equivalent to selectList().
But the semantic is not the same, and as such, it maybe a better solution to simply remove the selectCursor(String,Object,RowBound) method for now. Or introduce a new type of RowBound.

Currently a RowBound unit (= value of 1) in MyBatis is mapped to a row in a ResultSet (a database row).
The RowBound unit used in Cursor is mapped to a "toplevel" object (a List or Cursor element, mapped from 1 or more database rows). That'w why we may need a new object (ObjectBound ?)

The loop part is used to mimic the offset advance used in DefaultResultSetHandler : https://github.com/mybatis/mybatis-3/blob/master/src/main/java/org/apache/ibatis/executor/resultset/DefaultResultSetHandler.java#L348-350 . And since MyBatis can transform several rows into one "toplevel" object, I don't currently see how it could be more efficient. Of course, this triggers a lot of garbage since objects are created but never used.

But it seems to me that RowBound in general is only here to ease life on few situations (since it can lead to wrong results, as seen above when we use nested collections mappings). A developer with performance concerns will craft the SQL request to only compute the needed data, and so, won't have the need of the RowBound feature.

To sum up, here's a small todo list :

  • Fixes and tests for the fixes
  • Discuss about the RowBound pertinence on selectCursor() and if a new RowBound (ObjectBound ? name is open to discussion) type is needed. IMO, I think it would be better to introduce an ObjectBound. This feature could be then extended to other parts of MyBatis.

@alexey-su For the nested SQL queries, the current code simply rely on the core mapping features of MyBatis and as such, should handle them seamlessly. Do you have a reproduce test so that I can check on my side ?

Hope this helps.
Guillaume

@aaaweisen
Copy link

我用的3.4.2 cursor好像不起啥作用

pulllock pushed a commit to pulllock/mybatis-3 that referenced this pull request Oct 19, 2023
pulllock pushed a commit to pulllock/mybatis-3 that referenced this pull request Oct 19, 2023
* modify to use Cursor.class.isAssignableFrom(Class<?>)
* modfiy to use ExceptionFactory.wrapException(String,Exception)
* modify copyright year
pulllock pushed a commit to pulllock/mybatis-3 that referenced this pull request Oct 19, 2023
pulllock pushed a commit to pulllock/mybatis-3 that referenced this pull request Oct 19, 2023
pulllock pushed a commit to pulllock/mybatis-3 that referenced this pull request Oct 19, 2023
…ishing

Revert "mybatis#437: Polishing some logics and comments"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Improve a feature or add a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants