Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set transaction flag for database queries #2047

Closed
raymondlewallen opened this issue Feb 28, 2017 · 5 comments
Closed

Set transaction flag for database queries #2047

raymondlewallen opened this issue Feb 28, 2017 · 5 comments

Comments

@raymondlewallen
Copy link

Would like a way to set enable transaction true/false for Postgres. This would enable to run some commands that require to be run outside of a transaction block, such as redshift "VACUUM" statement. I've started working on overriding the query code to allow me to do this.

@dlstadther
Copy link
Collaborator

@raymondlewallen Feel free to contribute!

It looks like you'd need to update PostgresQuery and then RedshiftQuery (as a courtesy :) ).

I've had this same need (to execute vacuum). My solution was to just manually implement the execution and output myself (below).

def run(self):
       connection = psycopg2.connect('dbname={db} host={host} port={port} user={user} password={pass}'.format(args_dict))
        cursor = connection.cursor()

        # change isolation level temporarily to allow vacuum
        old_isolation_level = connection.isolation_level
        connection.set_isolation_level(0)
        cursor.execute(self.query)   # self.query is just 'vacuum;'
        connection.commit()
        connection.set_isolation_level(old_isolation_level)

        # output - update market table
        connection_target = self.output().connect()
        self.output().touch(connection_target)
        connection_target.commit()
        connection_target.close()

see here and here for two psycopg2 solutions to execute a vacuum.

@dmohns
Copy link
Contributor

dmohns commented May 19, 2017

One way I can think of is to specify some sort of dictionary in the PostgresQuery tasks. Then it could get passed to run and called from connection.set_session(<dict>). That way we get everything from here http://initd.org/psycopg/docs/connection.html#set_session , in particular autocommit=True allows to run VACUUM queries.

I am uncertain though: Is this the luigi way of doing things?

@dlstadther
Copy link
Collaborator

dlstadther commented Jul 9, 2017

I've started working on overriding the query code to allow me to do this.

@raymondlewallen Any update here? Feel free to open a WIP PR if necessary.

@dckc
Copy link
Contributor

dckc commented Aug 28, 2017

I spent hours figuring out why I couldn't get direct path loads in Oracle inserts inside luigi. It turned out to be a sqlalchemy thing, but just in case it helps other luigi users: twophase defaults to on and causes everything to run inside a transaction (which conflicts with direct path loads somehow). Turning it off was kinda kludgy, but it worked:

kumc-bmi/grouse@57ad98a#diff-30e6ced1669bd1fd79ea02fa07f5c143R120

@jamesmcm
Copy link
Contributor

With the merged PR #2242 in the future you will be able to set the autocommit property to True for the RedshiftQuery that runs the VACUUM (and other such operations), this solves the Redshift case at least.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants