@gdarmont
Contributor

Hi mybatis dev team,

I don’t know whether opening a pull request directly is the right way to submit a feature for MyBatis, or whether there’s another workflow. If there is, please let me know.

This PR adds a feature which is needed to implement a MyBatisCursorItemReader for Spring Batch (see mybatis/spring#8 )

The feature allows SqlSession.selectList() to return a custom list implementation (CursorList) that fetches items “on demand” as it is iterated over.

CursorList does not retain any fetched element, so it is well suited to iterating over a ResultSet with several million rows without exhausting memory.

The drawback is that one can only iterate one time over a CursorList.

If several iterations over the list are needed, LazyList is available. It’s a simple wrapper over CursorList that keeps already fetched elements in local storage.
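The caching idea behind LazyList can be illustrated with a minimal, self-contained sketch. `CachingIterable` and `CachingIterableDemo` are hypothetical names chosen for illustration; this is not the code from the PR, and it assumes a plain `Iterator` as the one-shot source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the LazyList idea: it wraps a one-shot iterator
// and remembers every element it has already handed out.
final class CachingIterable<T> implements Iterable<T> {
    private final Iterator<T> source;               // one-shot source, e.g. a cursor
    private final List<T> cache = new ArrayList<>(); // elements fetched so far

    CachingIterable(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                // Serve from the cache first, then fall back to the source.
                return index < cache.size() || source.hasNext();
            }

            @Override
            public T next() {
                if (index == cache.size()) {
                    if (!source.hasNext()) {
                        throw new NoSuchElementException();
                    }
                    cache.add(source.next());       // fetch on demand, keep a copy
                }
                return cache.get(index++);
            }
        };
    }

    int cachedCount() {
        return cache.size();
    }
}

final class CachingIterableDemo {
    public static void main(String[] args) {
        CachingIterable<String> names =
            new CachingIterable<>(Arrays.asList("ann", "bob", "cy").iterator());
        for (String n : names) System.out.println("first pass: " + n);
        for (String n : names) System.out.println("second pass: " + n);
    }
}
```

The trade-off is that the cache grows with every element read, so a wrapper like this gives up the constant memory footprint that makes the one-shot variant attractive for very large result sets.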

Since CursorList fetches elements on demand, it needs to keep the ResultSet open. It is automatically closed when the ResultSet is fully consumed. If the ResultSet is only partially consumed, it is the user's responsibility to close the CursorList correctly using CursorList#closeResultSetAndStatement().
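The closing rule described above (release automatically on exhaustion, otherwise on explicit request) can be sketched as follows. `OneShotCursor` is a hypothetical class, and the `Runnable` is an assumed stand-in for closing the real ResultSet and Statement; the sketch only illustrates the contract, not the PR's implementation.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical one-shot cursor: releases its resource when the last element
// has been read, or earlier if the caller closes it explicitly.
final class OneShotCursor<T> implements Iterator<T>, AutoCloseable {
    private final Iterator<T> rows;   // stands in for a JDBC ResultSet
    private final Runnable releaser;  // stands in for closing ResultSet + Statement
    private boolean closed = false;

    OneShotCursor(Iterator<T> rows, Runnable releaser) {
        this.rows = rows;
        this.releaser = releaser;
    }

    @Override
    public boolean hasNext() {
        if (closed) return false;
        if (!rows.hasNext()) {
            close();                  // fully consumed: close automatically
            return false;
        }
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return rows.next();
    }

    @Override
    public void close() {
        if (!closed) {                // idempotent: release exactly once
            closed = true;
            releaser.run();
        }
    }

    boolean isClosed() {
        return closed;
    }
}

final class OneShotCursorDemo {
    public static void main(String[] args) {
        // Partial consumption: try-with-resources guarantees the release.
        try (OneShotCursor<Integer> cursor = new OneShotCursor<>(
                Arrays.asList(1, 2, 3, 4).iterator(),
                () -> System.out.println("resources released"))) {
            System.out.println(cursor.next());
            System.out.println(cursor.next());
        }
    }
}
```

Implementing `AutoCloseable` is one way to make the partial-consumption case harder to get wrong, since the caller can scope the cursor with try-with-resources instead of remembering a dedicated close method.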

Usage :

XML mapping :

    <select id="getEmployeeNestedCursor" resultMap="results" fetchType="CURSOR" resultOrdered="true">
       select id,name,salary,skill from employees order by id
    </select>

Java code :

// Get a list which can have millions of rows
List<Employee> employees = sqlSession.selectList("getEmployeeNestedCursor");

// Get an iterator on employees
Iterator<Employee> iter = employees.iterator();

List<Employee> smallChunk = new ArrayList<Employee>(10);
while (iter.hasNext()) {
    // Fetch next 10 employees
    for(int i = 0; i<10 && iter.hasNext(); i++) {
        smallChunk.add(iter.next());
    }
    doSomethingWithAlreadyFetchedEmployees(smallChunk);
    smallChunk.clear();
}

Best regards,
Guillaume

gdarmont added 2 commits May 28, 2013 15:16
* This allows MyBatis to return immediately from a selectList() call, returning a list that fetches elements from the database on demand.
  This behavior is especially useful for implementing functionality such as a "MyBatisCursorItemReader" in MyBatis Spring.

* The returned list keeps an open connection, which should be explicitly closed by the MyBatis client using CursorList#closeResultSetAndStatement()
@alexey-su
Contributor

You need to use an implementation of the interface ResultHandler.

public interface AbonentMapper {
    // <select id="findAbonents" resultMap="abonentMap">SELECT * FROM ABONENTS</select>
    void findAbonents(ResultHandler handler);
}

public class ProviderAbonentsImpl {
    private void fetchAbonents(SqlSession sqlSession) {
        AbonentMapper mapper = sqlSession.getMapper(AbonentMapper.class);
        mapper.findAbonents(new AbonentHandler());
    }

    private class AbonentHandler implements ResultHandler {
        @Override
        public void handleResult(ResultContext context) {
            Abonent abonent = (Abonent) context.getResultObject();
            ...
        }
    }
}

@fengxx
Contributor

fengxx commented Jan 23, 2014

This is a very nice feature to have for processing large result sets as a stream.

@pgaertig

+1 for this PR: ResultHandler is a push style of passing data, whereas the cursor list implementation above gives us pull. That makes it easy to pass an iterator to higher application tiers without introducing custom callback wrappers. However, I don't like relying solely on the user to close a cursor that is abandoned midway. This could be hooked into the session-closing routine somehow, similar to what Hibernate does with ScrollableResults.
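The push versus pull distinction can be shown with a small sketch that uses only the JDK. `PushVersusPull`, `pushAll`, and `pullAtMost` are hypothetical names; a `Consumer` stands in for a ResultHandler-style callback and a plain `Iterator` stands in for the cursor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class PushVersusPull {
    // Push: the data source drives the loop and calls back for every row.
    static <T> void pushAll(Iterable<T> rows, Consumer<T> handler) {
        for (T row : rows) {
            handler.accept(row);
        }
    }

    // Pull: the caller drives the loop and decides when to ask for more.
    static <T> List<T> pullAtMost(Iterator<T> rows, int max) {
        List<T> taken = new ArrayList<>();
        while (taken.size() < max && rows.hasNext()) {
            taken.add(rows.next());
        }
        return taken;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d");

        List<String> pushed = new ArrayList<>();
        pushAll(rows, pushed::add);            // the handler sees every row
        System.out.println(pushed);

        Iterator<String> it = rows.iterator();
        System.out.println(pullAtMost(it, 2)); // the caller stops after two rows
        System.out.println(it.hasNext());      // the remaining rows are still unread
    }
}
```

With pull, the unread remainder stays behind the iterator, which is exactly why the question of who closes the underlying cursor becomes important.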

@alexey-su
Contributor

An iterator obtained from the list does not solve the problem of stopping the fetch.
RowBounds stops fetching data correctly.

public interface AbonentMapper {
    // <select id="findAbonents" resultMap="abonentMap">SELECT * FROM ABONENTS</select>
    void findAbonents(ResultHandler handler, RowBounds rowBounds);
}

public class ProviderAbonentsImpl {
    private void fetchAbonents(SqlSession sqlSession, RowBounds rowBounds) {
        AbonentMapper mapper = sqlSession.getMapper(AbonentMapper.class);
        ProxyRowBounds localRowBounds = new ProxyRowBounds(rowBounds.getOffset(), rowBounds.getLimit());
        mapper.findAbonents(new AbonentHandler(localRowBounds), localRowBounds);
    }

    private class AbonentHandler implements ResultHandler {
        private final ProxyRowBounds rowBounds;

        public AbonentHandler(ProxyRowBounds rowBounds) {
            this.rowBounds = rowBounds;
        }

        @Override
        public void handleResult(ResultContext context) {
            Abonent abonent = (Abonent) context.getResultObject();
            ...
            if(...)
               rowBounds.stop();
        }
    }

    private class ProxyRowBounds extends RowBounds {
        private int limit;
        public ProxyRowBounds() { this(RowBounds.NO_ROW_OFFSET, RowBounds.NO_ROW_LIMIT); }
        public ProxyRowBounds(int offset, int limit) {
           super(offset, limit);
           this.limit = limit;
        }
        public int getLimit() { return this.limit; }
        public void stop() { this.limit = 0; }
    }
}

We should create our own interface derived from "Iterable" and add a method for stopping the loop.
The "List" interface is overloaded with unnecessary methods.

@gdarmont
Contributor Author

The proposed Iterator implementation won't fetch new data unless you call the hasNext() or next() methods. See the "Java code" section in the PR description.

From what I've seen, RowBounds works at the JDBC ResultSet level: it stops fetching rows when it reaches the row number matching the limit. If the limit falls in the middle of a <collection> mapping, the resulting data will be incomplete. The proposed implementation ensures that <collection> mappings contain all matching data. The constraint is that the query has to be configured with resultOrdered="true" and a correct ORDER BY clause.
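The difference between cutting at the row level and reading until the parent key changes can be illustrated with a self-contained sketch. `Row`, `GroupingIterator`, and `GroupingDemo` are hypothetical, and the flat (id, skill) rows are an assumed shape for a join result; this is not how MyBatis maps results internally, only a model of why ordered rows are required.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical flat row as a join would return it: one line per (employee, skill).
final class Row {
    final int id;
    final String skill;

    Row(int id, String skill) {
        this.id = id;
        this.skill = skill;
    }
}

// Groups consecutive rows with the same id; only correct if rows are ordered by id.
final class GroupingIterator implements Iterator<List<String>> {
    private final Iterator<Row> rows;
    private Row pending;              // first row of the next group, if any

    GroupingIterator(Iterator<Row> rows) {
        this.rows = rows;
        this.pending = rows.hasNext() ? rows.next() : null;
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public List<String> next() {
        if (pending == null) throw new NoSuchElementException();
        int id = pending.id;
        List<String> skills = new ArrayList<>();
        // Keep reading until the id changes, so the group is always complete.
        while (pending != null && pending.id == id) {
            skills.add(pending.skill);
            pending = rows.hasNext() ? rows.next() : null;
        }
        return skills;
    }
}

final class GroupingDemo {
    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(1, "java"), new Row(1, "sql"), new Row(1, "xml"),
            new Row(2, "go"));

        // Cutting at the row level (limit = 2) loses "xml" for employee 1.
        System.out.println(new GroupingIterator(rows.subList(0, 2).iterator()).next());

        // Reading until the id changes yields the full collection.
        System.out.println(new GroupingIterator(rows.iterator()).next());
    }
}
```

If the rows were not ordered by id, the same employee could reappear later in the stream and the group emitted earlier would be incomplete, which is the reason for the ordering constraint.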

@pgaertig Agreed on the fact that the close could be handled by session.close() or before any other query on the same Session object.

@alexey-su
Contributor

We usually use a DataSource rather than a direct DriverManager.getConnection(url). Leaving ResultSets and Statements unclosed wastes expensive resources on the database server and in the DataSource pool.
On one project I had to track down exactly this kind of mistake: the server stopped working after 30 minutes under heavy load.

Handling RowBounds.limit is guaranteed to close the ResultSet and Statement.
With a break statement, all access to the data objects is lost.

Another example: calling Session.close() does not always close the connection, for instance with JTA + XA (JEE beans). There, it is the UserTransaction commit/rollback that actually closes the XAConnection.

Example of using JTA in Camel Java DSL:

from("file:{{scanPath}}")
.transaction("required")
.split().method(new ReadSplitter(), "read").stream()
  .filter().method(new DataFilter(), "filter")
     .process(new StoreProcess())
  .end()
.end();

What does CursorList offer? Only one method, iterator().
All the other methods, more than a dozen of them, give the same answer: unsupported.
For this reason, it should implement Iterable instead of List.

Do you think many people will remember that sqlSession.selectList(...) returned a CursorList implementation?
This is a potential source of errors. Everyone will expect MyBatis to close the cursor and the query neatly, as it always has.

@gdarmont
Contributor Author

Rest assured, I am aware of all these considerations. The implementation is currently used in a JTA environment with 2 JDBC datasources and several JMS queues, without any problem.

The only reason I made a CursorList instead of a CursorIterator is that the SqlSession object only offers methods that return a List. Changing the SqlSession object seemed too big a change for a PR.
Adding methods to SqlSession could be done on a 4.0-SNAPSHOT branch, as breaking compatibility is not desirable in a minor version change.

One thing is sure : a cursor behavior is a real need, whatever the implementation provided. I hope it will be added in MyBatis (using this PR or from a more global rework)

@emacarron
Member

Hi Guillaume,

With the last version we released, we are still stabilizing 3.2.x.

I do not think we need to go as far as 4.0 to add new methods to SqlSession. 3.3 is OK for that, as long as we only break "implementors". So feel free to add them.

@alexey-su
Contributor

Implementing Iterable would be very useful.
Another useful feature would be support for
SELECT ... FOR UPDATE

For PostgreSQL and similar databases, set the cursor name:
statementCursor.setCursorName(#{cursor_name})
and then execute the update statement

UPDATE ... WHERE CURRENT OF #{cursor_name}
DELETE ... WHERE CURRENT OF #{cursor_name}

Here is my sketch of a possible future cursor interface:

interface Cursor<T> extends Iterable<T> {
   String getCursorName();           // optional, for PostgreSQL etc.
   void setUpdateQuery(String name); // optional, for PostgreSQL etc.
   void setDeleteQuery(String name); // optional, for PostgreSQL etc.

   boolean isAutoClose(); // auto-close after fetching the last row
   void setAutoClose(boolean autoClose);
   boolean isClose();

   boolean isUpdatable(); // whether the cursor is updatable

   // Iterator<T> iterator(); inherited from Iterable
   void update(T rowData);  // update the current row if the cursor is updatable
   void remove();           // remove the current row if the cursor is updatable
   void close();            // close the cursor (ResultSet and Statement)
}

@emacarron emacarron added this to the 3.3.0 milestone Mar 12, 2014
@emacarron
Member

@gdarmont Next version will be 3.3.0. It is ok to introduce new methods to SqlSession. Can you please update the PR?

Thanks in advance!

@gdarmont
Contributor Author

@emacarron I won't have enough time right now to update the PR, though this is definitely something I have in mind.
Don't wait for the PR to be updated if you have an ETA to fulfill.

@emacarron
Member

Hi @gdarmont

No hurry at all. We are volunteers and there are no ETAs to fulfill. (This probably explains why it took us 9 months to pay attention to this enhancement.)

@emacarron emacarron self-assigned this Dec 22, 2014
@bwzhang2011

@emacarron, when will 3.3 be released, and will it include this useful feature?

@emacarron emacarron modified the milestones: 3.4.0, 3.3.0 Apr 25, 2015
@gdarmont gdarmont mentioned this pull request Jul 11, 2015
@gdarmont
Contributor Author

Closing this PR as the feature has been reworked in PR #437

@gdarmont gdarmont closed this Jul 11, 2015
kazuki43zoo added a commit to kazuki43zoo/mybatis-3 that referenced this pull request Jul 12, 2015
* modify to use Cursor.class.isAssignableFrom(Class<?>)
* modify to use ExceptionFactory.wrapException(String,Exception)
* modify copyright year
@kazuki43zoo
Member

This new feature is great 👍
I hope for an early release!!

@emacarron
Member

Yep. So do I. We will need some docs first otherwise very few people may get to this new feature.

BTW, people in this thread (@alexey-su @fengxx @pgaertig @bwzhang2011): can you please have a look at @gdarmont's latest PR and post your thoughts? (Please go to #437.)

@emacarron emacarron removed this from the 3.4.0 milestone Apr 9, 2016
@emacarron emacarron removed the enhancement label Apr 9, 2016
@emacarron emacarron removed their assignment Apr 9, 2016