diff --git a/.gitignore b/.gitignore index 2cabbca..0e4f06f 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,5 @@ -*/ +accounts/ *.json +.idea/ token.pickle +autorclone.conf diff --git a/README.md b/README.md index 10d54ee..9a4c595 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,34 @@ -folderclone - A project that allows you copy large folders to Shared Drives. -================================= +# Autorclone -Requirements for using the scripts ---------------------------------- +A Python script that uses Service Accounts to bypass the 750 GB per-account daily upload limit on Google Drive, +based on [folderclone](https://github.com/Spazzlo/folderclone) and [Multifolderclone](https://github.com/SameerkumarP/Multifolderclone). -* Python 3.4+ **(Use 64-Bit Python only)** -* The following modules from pip3: `google-api-python-client`, `google-auth-oauthlib`, `google-auth-httplib2` & `httplib2shim` +Unlike the existing projects, this repo uses [Rclone](https://rclone.org) to **transfer files from local disk +to Google Drive or a Team Drive**. -`multifolderclone.py` Setup ---------------------------------- -[`multifolderclone.py` Setup](https://github.com/Spazzlo/folderclone/blob/master/README_multifolderclone.md) +## Requirements + +* Python ^3.4 **(use 64-bit Python only)** +* The Python libraries listed in `requirements.txt` +* Rclone ^1.41 (to support the `service_account_credentials` feature) + +## Setup + +> Chinese version: [使用Service Account突破rclone单账号GD每日750G上传限制](https://blog.rhilip.info/archives/1135/) + +1. Set up `multifactory.py` 1) Head over to and sign in with your account. 2) Click "Library" on the left column, then click on "Select a project" at the top. Click on `NEW PROJECT` on the top-right corner of the new window. 3) In the Project name section, input a project name of your choice. Wait till the project creation is done, then click on "Select a project" again at the top and select your project. 
+    4) Select "OAuth consent screen" and fill out the **Application name** field with a name of your choice. Scroll down and hit "Save". +    5) Select "Credentials" and select Create credentials. Choose "OAuth client ID". Choose "Other" as your **Application type** and hit "Create". Hit "Ok". You will now be presented with a list of "OAuth 2.0 client IDs". At the right end, there will be a download icon. Select it to download and save it as `credentials.json` in the script folder. +    6) Find out how many projects you'll need. For example, a 100 TB job will take approximately 135 service accounts to make a full clone. Each project can have a maximum of 100 service accounts, so for the 100 TB job we will need 2 projects. `multifactory.py` conveniently includes a quick setup option. Run the following command: `python3 multifactory.py --quick-setup N`. **Replace `N` with the number of projects you need!** If you want to use only new projects instead of existing ones, make sure to add the `--new-only` flag. It will automatically start doing all the hard work for you. +    6a) Running this for the first time will prompt you to log in with your Google account. Log in with the same account you used for Step 1. It will then ask you to enable a service. Open the URL in your browser to enable it. Press Enter once it's enabled. + +2. Steps to add all the service accounts to the Shared Drive +    1) Once `multifactory.py` is done making all the accounts, open Google Drive and make a new Shared Drive to copy to. +    2) Run the following command: `python3 masshare.py -d SDFolderID`. Replace `SDFolderID` with the Folder ID, which can be obtained from the Shared Drive URL `https://drive.google.com/drive/folders/XXXXXXXXXXXXXXXXXXX`. `masshare.py` will start adding all your service accounts. + +3. Steps for `autorclone.py` +    1) Change the script config at the beginning of the file. 
+    2) Add a crontab entry such as `0 */1 * * * /usr/bin/python /path/to/autorclone.py` to run the script hourly. diff --git a/README_multifolderclone.md b/README_multifolderclone.md deleted file mode 100644 index c2e472f..0000000 --- a/README_multifolderclone.md +++ /dev/null @@ -1,29 +0,0 @@ -Steps on how to use `multifolderclone.py` -================================= - -Steps to setup `multifactory.py` ---------------------------------- -1) Head over to and sign in with your account. -2) Click "Library" on the left column, then click on "Select a project" at the top. Click on `NEW PROJECT` on the top-right corner of the new window. -3) In the Project name section, input a project name of your choice. Wait till the project creation is done and then click on "Select a project" again at the top and select your project. -4) Select "OAuth consent screen" and fill out the **Application name** field with a name of your choice. Scroll down and hit "Save" -5) Select "Credentials" and select Create credentials. Choose "OAuth client ID". Choose "Other" as your **Application type** and hit "Create". Hit "Ok". You will now be presented with a list of "OAuth 2.0 client IDs". At the right end, there will be a download icon. Select it to download and save it as `credentials.json` in the script folder. -6) Find out how many projects you'll need. For example, a 100 TB job will take approximately 135 service accounts to make a full clone. Each project can have a maximum of 100 service accounts. In the case of the 100TB job, we will need 2 projects. `multifactory.py` conveniently includes a quick setup option. Run the following command `python3 multifactory.py --quick-setup N`. **Replace `N` with the amount of projects you need!**. If you want to only use new projects instead of existing ones, make sure to add `--new-only` flag. It will automatically start doing all the hard work for you. -6a) Running this for the first time will prompt you to login with your Google account. 
Login with the same account you used for Step 1. If will then ask you to enable a service. Open the URL in your browser to enable it. Press Enter once it's enabled. - -Steps to add all the service accounts to the Shared Drive ---------------------------------- -1) Once `multifactory.py` is done making all the accounts, open Google Drive and make a new Shared Drive to copy to. -2) Run the following command `python3 masshare.py -d SDFolderID`. Replace the `SDFolderID` with `XXXXXXXXXXXXXXXXXXX`. The Folder ID can be obtained from the Shared Drive URL `https://drive.google.com/drive/folders/XXXXXXXXXXXXXXXXXXX`. `masshare.py` will start adding all your service accounts. - -**Shared Drives can only fit up to 600 users!** - -Steps to clone a public folder to the Shared Drive ---------------------------------- -1) Run the following command, `python3 multifolderclone.py -s SourceFolderID -d SDFolderID`. Replace `SourceFolderID` with the folder ID of the folder you are trying to copy and replace `SDFolderID` with the same ID as used in step 2 in `Steps to add service accounts to a Shared Drive`. It will start cloning the folder into the Shared Drive. - -Steps to *sync* a public folder to the Shared Drive ---------------------------------- -`multifolderclone.py` will now know if something's been copied already! Run the command again to copy over any new or missing files. *`multifolderclone.py` will not delete any files in the destination **not** in the source* - -### As always, use the [Issues](https://github.com/Spazzlo/folderclone/issues) tab for any bugs, issues, feature requests or documentation improvements. 
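The new `autorclone.py` added below decides when to rotate to the next service account by polling `rclone rc core/stats` and counting two signals: total bytes not growing across many checks, and no individual file still making progress. The same heuristic can be distilled into a standalone, testable function. This is a sketch, not part of the script: `plan_switch`, `stall_limit`, and the tuple return are illustrative names; only the `bytes`, `transferring`, and `speed` fields mirror the actual `rclone rc core/stats` output the script reads.

```python
import json


def plan_switch(stats_json, cnt_transfer_last, cnt_403_retry,
                stall_limit=200, switch_sa_level=2):
    """Decide whether to rotate to the next service account.

    stats_json: raw JSON text from `rclone rc core/stats`.
    Returns (switch_now, new_cnt_transfer_last, new_cnt_403_retry).
    """
    stats = json.loads(stats_json)
    votes = 0

    # Vote 1: total transferred bytes have not grown for more
    # than `stall_limit` consecutive checks.
    cnt_transfer = stats['bytes']
    if cnt_transfer == cnt_transfer_last:
        cnt_403_retry += 1
        if cnt_403_retry > stall_limit:
            votes += 1
    else:
        cnt_403_retry = 0

    # Vote 2: no in-flight file is still making progress
    # (an empty `transferring` list also counts as idle).
    idle = all(not (t['bytes'] != 0 and t['speed'] > 0)
               for t in stats.get('transferring', []))
    if idle:
        votes += 1

    return votes >= switch_sa_level, cnt_transfer, cnt_403_retry
```

With `switch_sa_level = 2`, both signals must agree before the account is rotated, which is why the README recommends level 2: a single slow check is not enough to kill a healthy transfer.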
diff --git a/autorclone.py b/autorclone.py new file mode 100644 index 0000000..f6c2f7a --- /dev/null +++ b/autorclone.py @@ -0,0 +1,154 @@ +import os +import json +import time +import glob +import logging +import subprocess + +import filelock + +# ------------ Config start ------------ + +# Directory of the service account JSON key files +sa_json_folder = r'/home/folderrclone/accounts' # no trailing '/' + +# Source and destination for the rclone move +rclone_src = '/home/tomove' +rclone_des = 'GDrive:/tmp' + +# Log files +rclone_log_file = r'/tmp/rclone.log' # rclone's log output +script_log_file = r'/tmp/autorclone.log' # this script's own log + +# Temporary files used by this script +instance_lock_path = r'/tmp/autorclone.lock' +instance_config_path = r'/tmp/autorclone.conf' + +# Interval between rclone status checks (seconds) +check_interval = 10 + +# Strictness of account switching (1-3); the higher the number, the stricter the check. 2 is recommended. +switch_sa_level = 2 + +# The rclone command to run +cmd_rclone = [ + # Base rclone command + 'rclone', 'move', rclone_src, rclone_des, + # Required options (do not change) + '--drive-server-side-across-configs', # prefer server-side transfers + '--rc', # enable the remote control API; do not remove it, account switching depends on it + '-v', '--log-file', rclone_log_file, + # Optional rclone flags, disabled by default; adjust to your needs + # '--ignore-existing', + # '--fast-list', + # '--tpslimit', '6', + # '--transfers', '12', + # '--drive-chunk-size', '32M', + # '--drive-acknowledge-abuse', +] + +# ------------ Config end ------------ + +instance_config = {} +sa_jsons = [] + + +def write_config(name, value): + instance_config[name] = value + with open(instance_config_path, 'w') as f: + json.dump(instance_config, f, sort_keys=True) + + +# Get the path of the next service account credentials JSON file +def get_next_sa_json_path(_last_sa): + if _last_sa not in sa_jsons: + next_sa_index = 0 + else: + _last_sa_index = sa_jsons.index(_last_sa) + next_sa_index = _last_sa_index + 1 + + # Wrap around to the start of the list + if next_sa_index >= len(sa_jsons): + next_sa_index = next_sa_index - len(sa_jsons) + return sa_jsons[next_sa_index] + + +if __name__ == '__main__': + # Single-instance guard ( ̄y▽, ̄)╭ + instance_check = filelock.FileLock(instance_lock_path) + with instance_check.acquire(timeout=0): + # Load the service account key files + sa_jsons = 
glob.glob(os.path.join(sa_json_folder, '*.json')) + if len(sa_jsons) == 0: + raise RuntimeError('No Service Account Credentials JSON file exists') + + # Load the instance config + if os.path.exists(instance_config_path): + config_raw = open(instance_config_path).read() + instance_config = json.loads(config_raw) + + if switch_sa_level > 2 or switch_sa_level < 0: + switch_sa_level = 2 + + # Reorder sa_jsons so the last-used account comes first, if one is recorded + last_sa = instance_config.get('last_sa', '') + if last_sa in sa_jsons: + last_sa_index = sa_jsons.index(last_sa) + sa_jsons = sa_jsons[last_sa_index:] + sa_jsons[:last_sa_index] + + # Account switching loop + while True: + current_sa = get_next_sa_json_path(last_sa) + last_sa = current_sa # advance, so the next iteration picks a new account + write_config('last_sa', current_sa) + + # Spawn rclone as a subprocess with the '--drive-service-account-file' option appended + cmd_rclone_current_sa = cmd_rclone + ['--drive-service-account-file', current_sa] + p = subprocess.Popen(cmd_rclone_current_sa, stdout=subprocess.PIPE, stderr=subprocess.PIPE) # no shell=True: the command is an argument list + + # The main process polls `rclone rc core/stats` to watch the subprocess + cnt_error = 0 + cnt_403_retry = 0 + cnt_transfer_last = 0 + cmd_stats = 'rclone rc core/stats' + while True: + try: + response = subprocess.check_output(cmd_stats, shell=True) + except subprocess.CalledProcessError: + cnt_error = cnt_error + 1 + if cnt_error >= 3: + p.kill() + raise RuntimeError('rclone may have finished all work and exited; the core/stats check failed 3 times.') + time.sleep(check_interval) + continue + else: + cnt_error = 0 + + response_processed = response.decode('utf-8').replace('\0', '') + response_processed_json = json.loads(response_processed) + + should_switch = 0 + + # Compare how much rclone transferred between two consecutive checks + cnt_transfer = response_processed_json['bytes'] + if cnt_transfer - cnt_transfer_last == 0: # no progress + cnt_403_retry += 1 + if cnt_403_retry > 200: # no progress for more than 200 consecutive checks + should_switch += 1 + else: + cnt_403_retry = 0 + cnt_transfer_last = cnt_transfer + + # Check the files currently being transferred + graceful = True + for transfer in response_processed_json['transferring']: + if transfer['bytes'] != 0 and transfer['speed'] > 0: # a transfer is still making progress + 
graceful = False + break + if graceful: + should_switch += 1 + + # Switch once the configured level is reached + if should_switch >= switch_sa_level: + p.kill() # kill the current rclone process + break # exit the monitor loop and move on to the next account + + time.sleep(check_interval) diff --git a/counter.py b/counter.py deleted file mode 100644 index 6febd9c..0000000 --- a/counter.py +++ /dev/null @@ -1,59 +0,0 @@ -from google.oauth2.service_account import Credentials -import googleapiclient.discovery, json, progress.bar, socket, time, sys, glob - -stt = time.time() -fct = 0 -dct = 0 - -try: - sas = glob.glob('key.json') - sas.extend(glob.glob('controller/*.json')) - sas.extend(glob.glob('accounts/*.json')) - filename = sas[0] -except IndexError: - print('No Service Account Found.') - sys.exit(0) - -credentials = Credentials.from_service_account_file(filename, scopes=[ - "https://www.googleapis.com/auth/drive" -]) -drive = googleapiclient.discovery.build("drive", "v3", credentials=credentials) - -def ls(parent, searchTerms=""): - files = [] - resp = drive.files().list(q=f"'{parent}' in parents" + searchTerms, pageSize=1000, supportsAllDrives=True, includeItemsFromAllDrives=True).execute() - files += resp["files"] - while "nextPageToken" in resp: - resp = drive.files().list(q=f"'{parent}' in parents" + searchTerms, pageSize=1000, supportsAllDrives=True, includeItemsFromAllDrives=True, pageToken=resp["nextPageToken"]).execute() - files += resp["files"] - return files - -def lsd(parent): - return ls(parent, searchTerms=" and mimeType contains 'application/vnd.google-apps.folder'") - -def lsf(parent): - return ls(parent, searchTerms=" and not mimeType contains 'application/vnd.google-apps.folder'") - -def rs(source): - global fct, dct - - fs = lsf(source) - fct += len(fs) - - fd = lsd(source) - dct += len(fd) - for i in fd: - rs(i['id']) - -try: - sp = sys.argv[1] -except IndexError: - sp = input('Folder ID to scan? 
').strip() - -print('Counting objects in %s' % sp) -rs(sp) -print('Objects: %d\nFolders: %d' % (fct,dct)) - -hours, rem = divmod((time.time() - stt),3600) -minutes, sec = divmod(rem,60) -print("Elapsed Time:\n{:0>2}:{:0>2}:{:05.2f}".format(int(hours),int(minutes),sec)) diff --git a/folderclone.py b/folderclone.py deleted file mode 100644 index 826f0e0..0000000 --- a/folderclone.py +++ /dev/null @@ -1,121 +0,0 @@ -""" -USAGE: -python3 folderclone.py starting_account source_dir destination_dir - -starting_account: -account number to start on, useful for having multiple clones use different accounts - -source_dir: -id of the source directory - -destination_dir: -id of the destination directory -""" - -from oauth2client.service_account import ServiceAccountCredentials -import googleapiclient.discovery, json, progress.bar, socket, time, sys - -cred_num = int(sys.argv[1]) - -credentials = ServiceAccountCredentials.from_json_keyfile_name("accounts/" + str(cred_num) + ".json", scopes=[ - "https://www.googleapis.com/auth/drive" -]) - -drive = googleapiclient.discovery.build("drive", "v3", credentials=credentials) - -logfile = open("log.txt", "w") - -def logwrite(logtext): - - logfile.write(time.strftime("%m-%d %H:%M:%S") + " " + logtext + "\n") - logfile.flush() - -def ls(parent, searchTerms="", fname=""): - files = [] - resp = drive.files().list(q=f"'{parent}' in parents" + searchTerms, pageSize=1000, supportsAllDrives=True, includeItemsFromAllDrives=True).execute() - files += resp["files"] - while "nextPageToken" in resp: - resp = drive.files().list(q=f"'{parent}' in parents" + searchTerms, pageSize=1000, supportsAllDrives=True, includeItemsFromAllDrives=True, pageToken=resp["nextPageToken"]).execute() - files += resp["files"] - return files - -def lsd(parent, fname=""): - - return ls(parent, searchTerms=" and mimeType contains 'application/vnd.google-apps.folder'", fname=fname) - -def lsf(parent, fname=""): - - return ls(parent, searchTerms=" and not mimeType contains 
'application/vnd.google-apps.folder'", fname=fname) - -def copy(source, dest): - - global drive - global cred_num - - try: - copied_file = drive.files().copy(fileId=source, body={"parents": [dest]}, supportsAllDrives=True).execute() - except googleapiclient.errors.HttpError as e: - cred_num += 1 - if cred_num % 100 == 0: - cred_num += 1 - credentials = ServiceAccountCredentials.from_json_keyfile_name("accounts/" + str(cred_num) + ".json", scopes=[ - "https://www.googleapis.com/auth/drive" - ]) - drive = googleapiclient.discovery.build("drive", "v3", credentials=credentials) - logwrite("changed cred_num to " + str(cred_num)) - copy(source, dest) - except socket.timeout: - logwrite("timeout") - time.sleep(60) - copy(source, dest) - except Exception as e: - logwrite("error: " + str(e)) - try: - copy(source, dest) - except RecursionError as e: - logwrite("max recursion reached") - raise e - -def rcopy(source, dest, sname): - - global drive - global cred_num - - filestocopy = lsf(source, fname=sname) - if len(filestocopy) > 0: - pbar = progress.bar.Bar("copy " + sname, max=len(filestocopy)) - pbar.update() - logwrite("copy filedir " + sname) - for i in filestocopy: - - copy(i["id"], dest) - pbar.next() - - pbar.finish() - - else: - - logwrite("copy dirdir " + sname) - print("copy " + sname) - - folderstocopy = lsd(source) - for i in folderstocopy: - - resp = drive.files().create(body={ - "name": i["name"], - "mimeType": "application/vnd.google-apps.folder", - "parents": [dest] - }, supportsAllDrives=True).execute() - - rcopy(i["id"], resp["id"], i["name"]) - -print("copying files... 
eta 5 minutes") -logwrite("start copy") -try: - rcopy(str(sys.argv[2]), str(sys.argv[3]), "root") -except Exception as e: - logfile.close() - raise e -print("completed copy with account " + str(cred_num)) -logwrite("finish copy") -logfile.close() diff --git a/masshare.py b/masshare.py index ac8aeca..4212bc4 100644 --- a/masshare.py +++ b/masshare.py @@ -10,16 +10,20 @@ successful = [] -def _is_success(id,resp,exception): + +def _is_success(id, resp, exception): global successful if exception is None: successful.append(resp['emailAddress']) -def masshare(drive_id=None,path='accounts',token='token.pickle',credentials='credentials.json'): + +def masshare(drive_id=None, path='accounts', token='token.pickle', credentials='credentials.json'): global successful - SCOPES = ["https://www.googleapis.com/auth/drive","https://www.googleapis.com/auth/cloud-platform","https://www.googleapis.com/auth/iam"] + SCOPES = ["https://www.googleapis.com/auth/drive", + "https://www.googleapis.com/auth/cloud-platform", + "https://www.googleapis.com/auth/iam"] creds = None if exists(token): @@ -41,7 +45,7 @@ def masshare(drive_id=None,path='accounts',token='token.pickle',credentials='cre print('Fetching emails') for i in glob('%s/*.json' % path): - accounts_to_add.append(loads(open(i,'r').read())['client_email']) + accounts_to_add.append(loads(open(i, 'r').read())['client_email']) while len(successful) < len(accounts_to_add): print('Preparing %d members' % (len(accounts_to_add) - len(successful))) @@ -56,13 +60,14 @@ def masshare(drive_id=None,path='accounts',token='token.pickle',credentials='cre print('Adding') batch.execute() + if __name__ == '__main__': parse = ArgumentParser(description='A tool to add service accounts to a shared drive from a folder containing credential files.') - parse.add_argument('--path','-p',default='accounts',help='Specify an alternative path to the service accounts folder.') - parse.add_argument('--token',default='token.pickle',help='Specify the pickle token 
file path.') - parse.add_argument('--credentials',default='credentials.json',help='Specify the credentials file path.') + parse.add_argument('--path', '-p', default='accounts', help='Specify an alternative path to the service accounts folder.') + parse.add_argument('--token', default='token.pickle', help='Specify the pickle token file path.') + parse.add_argument('--credentials', default='credentials.json', help='Specify the credentials file path.') parsereq = parse.add_argument_group('required arguments') - parsereq.add_argument('--drive-id','-d',help='The ID of the Shared Drive.',required=True) + parsereq.add_argument('--drive-id', '-d', help='The ID of the Shared Drive.', required=True) args = parse.parse_args() masshare( drive_id=args.drive_id, diff --git a/multifactory.py b/multifactory.py index 577e085..9bb3d9a 100644 --- a/multifactory.py +++ b/multifactory.py @@ -8,48 +8,60 @@ from json import loads from time import sleep from glob import glob -import os,pickle +import os, pickle -SCOPES = ['https://www.googleapis.com/auth/drive','https://www.googleapis.com/auth/cloud-platform','https://www.googleapis.com/auth/iam'] +SCOPES = [ + 'https://www.googleapis.com/auth/drive', + 'https://www.googleapis.com/auth/cloud-platform', + 'https://www.googleapis.com/auth/iam' +] project_create_ops = [] current_key_dump = [] sleep_time = 30 # Create count SAs in project -def _create_accounts(service,project,count): +def _create_accounts(service, project, count): batch = service.new_batch_http_request(callback=_def_batch_resp) for i in range(count): aid = _generate_id('mfc-') - batch.add(service.projects().serviceAccounts().create(name='projects/' + project, body={ 'accountId': aid, 'serviceAccount': { 'displayName': aid }})) + batch.add(service.projects().serviceAccounts().create(name='projects/' + project, + body={'accountId': aid, + 'serviceAccount': {'displayName': aid} + })) batch.execute() + # Create accounts needed to fill project -def 
_create_remaining_accounts(iam,project): +def _create_remaining_accounts(iam, project): print('Creating accounts in %s' % project) - sa_count = len(_list_sas(iam,project)) + sa_count = len(_list_sas(iam, project)) while sa_count != 100: - _create_accounts(iam,project,100 - sa_count) - sa_count = len(_list_sas(iam,project)) + _create_accounts(iam, project, 100 - sa_count) + sa_count = len(_list_sas(iam, project)) + # Generate a random id def _generate_id(prefix='saf-'): chars = '-abcdefghijklmnopqrstuvwxyz1234567890' return prefix + ''.join(choice(chars) for _ in range(25)) + choice(chars[1:]) + # List projects using service def _get_projects(service): return [i['projectId'] for i in service.projects().list().execute()['projects']] + # Default batch callback handler -def _def_batch_resp(id,resp,exception): +def _def_batch_resp(id, resp, exception): if exception is not None: if str(exception).startswith(' 0: current_count = len(_get_projects(cloud)) if current_count + create_projects < max_projects: - print('Creating %d projects' % (create_projects)) + print('Creating %d projects' % create_projects) nprjs = _create_projects(cloud, create_projects) selected_projects = nprjs else: @@ -207,16 +228,16 @@ def serviceaccountfactory( ste = _get_projects(cloud) services = [i + '.googleapis.com' for i in services] print('Enabling services') - _enable_services(serviceusage,ste,services) + _enable_services(serviceusage, ste, services) if create_sas: stc = [] stc.append(create_sas) if create_sas == '~': stc = selected_projects elif create_sas == '*': - stc = _get_projects(cloud) + stc = _get_projects(cloud) for i in stc: - _create_remaining_accounts(iam,i) + _create_remaining_accounts(iam, i) if download_keys: try: os.mkdir(path) @@ -228,7 +249,7 @@ def serviceaccountfactory( std = selected_projects elif download_keys == '*': std = _get_projects(cloud) - _create_sa_keys(iam,std,path) + _create_sa_keys(iam, std, path) if delete_sas: std = [] std.append(delete_sas) @@ -238,24 
+259,25 @@ def serviceaccountfactory( std = _get_projects(cloud) for i in std: print('Deleting service accounts in %s' % i) - _delete_sas(iam,i) + _delete_sas(iam, i) + if __name__ == '__main__': parse = ArgumentParser(description='A tool to create Google service accounts.') - parse.add_argument('--path','-p',default='accounts',help='Specify an alternate directory to output the credential files.') - parse.add_argument('--token',default='token.pickle',help='Specify the pickle token file path.') - parse.add_argument('--credentials',default='credentials.json',help='Specify the credentials file path.') - parse.add_argument('--list-projects',default=False,action='store_true',help='List projects viewable by the user.') - parse.add_argument('--list-sas',default=False,help='List service accounts in a project.') - parse.add_argument('--create-projects',type=int,default=None,help='Creates up to N projects.') - parse.add_argument('--max-projects',type=int,default=12,help='Max amount of project allowed. Default: 12') - parse.add_argument('--enable-services',default=None,help='Enables services on the project. Default: IAM and Drive') - parse.add_argument('--services',nargs='+',default=['iam','drive'],help='Specify a different set of services to enable. Overrides the default.') - parse.add_argument('--create-sas',default=None,help='Create service accounts in a project.') - parse.add_argument('--delete-sas',default=None,help='Delete service accounts in a project.') - parse.add_argument('--download-keys',default=None,help='Download keys for all the service accounts in a project.') - parse.add_argument('--quick-setup',default=None,type=int,help='Create projects, enable services, create service accounts and download keys. 
') - parse.add_argument('--new-only',default=False,action='store_true',help='Do not use exisiting projects.') + parse.add_argument('--path', '-p', default='accounts', help='Specify an alternate directory to output the credential files.') + parse.add_argument('--token', default='token.pickle', help='Specify the pickle token file path.') + parse.add_argument('--credentials', default='credentials.json', help='Specify the credentials file path.') + parse.add_argument('--list-projects', default=False, action='store_true', help='List projects viewable by the user.') + parse.add_argument('--list-sas', default=False, help='List service accounts in a project.') + parse.add_argument('--create-projects', type=int, default=None, help='Creates up to N projects.') + parse.add_argument('--max-projects', type=int, default=12, help='Max number of projects allowed. Default: 12') + parse.add_argument('--enable-services', default=None, help='Enables services on the project. Default: IAM and Drive') + parse.add_argument('--services', nargs='+', default=['iam', 'drive'], help='Specify a different set of services to enable. Overrides the default.') + parse.add_argument('--create-sas', default=None, help='Create service accounts in a project.') + parse.add_argument('--delete-sas', default=None, help='Delete service accounts in a project.') + parse.add_argument('--download-keys', default=None, help='Download keys for all the service accounts in a project.') + parse.add_argument('--quick-setup', default=None, type=int, help='Create projects, enable services, create service accounts and download keys.') + parse.add_argument('--new-only', default=False, action='store_true', help='Do not use existing projects.') args = parse.parse_args() # If credentials file is invalid, search for one. 
if not os.path.exists(args.credentials): @@ -266,9 +288,9 @@ def serviceaccountfactory( else: i = 0 print('Select a credentials file below.') - inp_options = [str(i) for i in list(range(1,len(options) + 1))] + options + inp_options = [str(i) for i in list(range(1, len(options) + 1))] + options while i < len(options): - print(' %d) %s' % (i + 1,options[i])) + print(' %d) %s' % (i + 1, options[i])) i += 1 inp = None while True: @@ -284,7 +306,7 @@ def serviceaccountfactory( opt = '*' if args.new_only: opt = '~' - args.services = ['iam','drive'] + args.services = ['iam', 'drive'] args.create_projects = args.quick_setup args.enable_services = opt args.create_sas = opt @@ -313,8 +335,8 @@ def serviceaccountfactory( print('No projects.') elif args.list_sas: if resp: - print('Service accounts in %s (%d):' % (args.list_sas,len(resp))) + print('Service accounts in %s (%d):' % (args.list_sas, len(resp))) for i in resp: - print(' %s (%s)' % (i['email'],i['uniqueId'])) + print(' %s (%s)' % (i['email'], i['uniqueId'])) else: print('No service accounts.') diff --git a/multifolderclone.py b/multifolderclone.py deleted file mode 100644 index 7831211..0000000 --- a/multifolderclone.py +++ /dev/null @@ -1,302 +0,0 @@ -from google.oauth2.service_account import Credentials -from googleapiclient.errors import HttpError -from urllib3.exceptions import ProtocolError -from googleapiclient.discovery import build -from argparse import ArgumentParser -from httplib2shim import patch -from glob import glob -import time,threading,json,socket - -account_count = 0 -dtu = 1 -retry = [] -drive = [] -threads = None -bad_drives = [] - -error_codes = { - 'dailyLimitExceeded': True, - 'userRateLimitExceeded': True, - 'rateLimitExceeded': True, - 'sharingRateLimitExceeded': True, - 'appNotAuthorizedToFile': True, - 'insufficientFilePermissions': True, - 'domainPolicy': True, - 'backendError': True, - 'internalError': True, - 'badRequest': False, - 'invalidSharingRequest': False, - 'authError': False, - 
-    'notFound': False
-}
-
-patch()
-
-def log(*l):
-    global debug
-    if debug:
-        for i in l:
-            print(i)
-
-def apicall(request,sleep_time=1,max_retries=3):
-    global error_codes
-
-    resp = None
-    tries = 0
-
-    while True:
-        tries += 1
-        if tries > max_retries:
-            return None
-        try:
-            resp = request.execute()
-            if tries > 1:
-                log('Successfully retried')
-        except HttpError as error:
-            log(error)
-            try:
-                error_details = json.loads(error.content.decode("utf-8"))
-            except json.decoder.JSONDecodeError:
-                time.sleep(sleep_time)
-                continue
-            reason = error_details["error"]["errors"][0]["reason"]
-            if reason == 'userRateLimitExceeded':
-                return False
-            elif error_codes[reason]:
-                time.sleep(sleep_time)
-                continue
-            else:
-                return None
-        except (socket.error, ProtocolError):
-            time.sleep(sleep_time)
-            continue
-        else:
-            return resp
-
-def ls(parent, searchTerms=""):
-    files = []
-
-    resp = apicall(
-        drive[0].files().list(
-            q="'" + parent + "' in parents" + searchTerms,
-            fields='files(md5Checksum,id,name),nextPageToken',
-            pageSize=1000,
-            supportsAllDrives=True,
-            includeItemsFromAllDrives=True
-        )
-    )
-    files += resp["files"]
-
-    while "nextPageToken" in resp:
-        resp = apicall(
-            drive[0].files().list(
-                q="'" + parent + "' in parents" + searchTerms,
-                fields='files(md5Checksum,id,name),nextPageToken',
-                pageSize=1000,
-                supportsAllDrives=True,
-                includeItemsFromAllDrives=True,
-                pageToken=resp["nextPageToken"]
-            )
-        )
-        files += resp["files"]
-    return files
-
-def lsd(parent):
-    return ls(
-        parent,
-        searchTerms=" and mimeType contains 'application/vnd.google-apps.folder'"
-    )
-
-def lsf(parent):
-    return ls(
-        parent,
-        searchTerms=" and not mimeType contains 'application/vnd.google-apps.folder'"
-    )
-
-def copy(driv, source, dest):
-    global bad_drives
-    global retry
-    if apicall(driv.files().copy(fileId=source, body={"parents": [dest]}, supportsAllDrives=True)) == False:
-        bad_drives.append(driv)
-        retry.append((source,dest))
-    threads.release()
-
-def rcopy(drive, dtu, source, dest, sname, pre, width):
-    global threads
-    global retry
-    global bad_drives
-
-    pres = pre
-    files_source = lsf(source)
-    files_dest = lsf(dest)
-    folders_source = lsd(source)
-    folders_dest = lsd(dest)
-    files_to_copy = []
-    files_source_id = []
-    files_dest_id = []
-
-    fs = len(folders_source) - 1
-
-    folders_copied = {}
-    for file in files_source:
-        files_source_id.append(dict(file))
-        file.pop('id')
-    for file in files_dest:
-        files_dest_id.append(dict(file))
-        file.pop('id')
-
-    i = 0
-    while len(files_source) > i:
-        if files_source[i] not in files_dest:
-            files_to_copy.append(files_source_id[i])
-        i += 1
-    for i in retry:
-        threads.acquire()
-        thread = threading.Thread(
-            target=copy,
-            args=(
-                drive[dtu],
-                i[0],
-                i[1]
-            )
-        )
-        thread.start()
-        dtu += 1
-        if dtu > len(drive) - 1:
-            dtu = 1
-    retry = []
-    if len(files_to_copy) > 0:
-        for file in files_to_copy:
-            threads.acquire()
-            thread = threading.Thread(
-                target=copy,
-                args=(
-                    drive[dtu],
-                    file['id'],
-                    dest
-                )
-            )
-            thread.start()
-            dtu += 1
-            if dtu > len(drive) - 1:
-                dtu = 1
-        print(pres + sname + ' | Synced')
-    elif len(files_source) > 0 and len(files_source) <= len(files_dest):
-        print(pres + sname + ' | Up to date')
-    else:
-        print(pres + sname)
-    log(len(bad_drives),bad_drives)
-    log(len(drive))
-    for i in bad_drives:
-        if i in drive:
-            drive.remove(i)
-    bad_drives = []
-    if len(drive) == 1:
-        print('Out of SAs.')
-        return
-
-    for i in folders_dest:
-        folders_copied[i['name']] = i['id']
-
-    s = 0
-    for folder in folders_source:
-        if s == fs:
-            nstu = pre.replace("├" + "─" * width + " ", "│" + " " * width + " ").replace("└" + "─" * width + " ", " " + " " * width) + "└" + "─" * width + " "
-        else:
-            nstu = pre.replace("├" + "─" * width + " ", "│" + " " * width + " ").replace("└" + "─" * width + " ", " " + " " * width) + "├" + "─" * width + " "
-        if folder['name'] not in folders_copied.keys():
-            folder_id = apicall(
-                drive[0].files().create(
-                    body={
-                        "name": folder["name"],
-                        "mimeType": "application/vnd.google-apps.folder",
-                        "parents": [dest]
-                    },
-                    supportsAllDrives=True
-                )
-            )['id']
-        else:
-            folder_id = folders_copied[folder['name']]
-        drive = rcopy(
-            drive,
-            dtu,
-            folder["id"],
-            folder_id,
-            folder["name"].replace('%', "%%"),
-            nstu,
-            width
-        )
-        s += 1
-    return drive
-
-def multifolderclone(source=None, dest=None, path='accounts', width=2, thread_count=None):
-    global account_count
-    global drive
-    global threads
-
-    stt = time.time()
-    accounts = glob(path + '/*.json')
-
-    check = build("drive", "v3", credentials=Credentials.from_service_account_file(accounts[0]))
-    try:
-        root_dir = check.files().get(fileId=source, supportsAllDrives=True).execute()['name']
-    except HttpError:
-        print('Source folder cannot be read or is invalid.')
-        exit(0)
-    try:
-        dest_dir = check.files().get(fileId=dest, supportsAllDrives=True).execute()['name']
-    except HttpError:
-        print('Destination folder cannot be read or is invalid.')
-        exit(0)
-
-    print('Copy from ' + root_dir + ' to ' + dest_dir + '.')
-    print('View set to tree (' + str(width) + ').')
-
-    print("Creating %d Drive Services" % len(accounts))
-    for account in accounts:
-        account_count += 1
-        credentials = Credentials.from_service_account_file(account, scopes=[
-            "https://www.googleapis.com/auth/drive"
-        ])
-        drive.append(build("drive", "v3", credentials=credentials))
-    if thread_count is not None and thread_count <= account_count:
-        threads = threading.BoundedSemaphore(thread_count)
-    else:
-        threads = threading.BoundedSemaphore(account_count)
-
-    print('BoundedSemaphore with %d threads' % account_count)
-
-    try:
-        rcopy(drive, 1, source, dest, root_dir, "", width)
-    except KeyboardInterrupt:
-        print('Quitting')
-        exit(0)
-
-    print('Complete.')
-    hours, rem = divmod((time.time() - stt), 3600)
-    minutes, sec = divmod(rem, 60)
-    print("Elapsed Time:\n{:0>2}:{:0>2}:{:05.2f}".format(int(hours), int(minutes), sec))
-
-def main():
-    global debug
-    parse = ArgumentParser(description='A tool intended to copy large files from one folder to another.')
-    parse.add_argument('--width', '-w', type=int, default=2, help='Set the width of the view option.')
-    parse.add_argument('--path', '-p', default='accounts', help='Specify an alternative path to the service accounts.')
-    parse.add_argument('--debug-mode',default=False,action='store_true',help='Completely verbose.')
-    parse.add_argument('--threads', type=int, default=None,help='Specify a different thread count. Cannot be greater than the amount of service accounts available.')
-    parsereq = parse.add_argument_group('required arguments')
-    parsereq.add_argument('--source-id', '-s',help='The source ID of the folder to copy.',required=True)
-    parsereq.add_argument('--destination-id', '-d',help='The destination ID of the folder to copy to.',required=True)
-    args = parse.parse_args()
-    debug = args.debug_mode
-    multifolderclone(
-        source=args.source_id,
-        dest=args.destination_id,
-        path=args.path,
-        width=args.width,
-        thread_count=args.threads
-    )
-
-if __name__ == '__main__':
-    main()
-
diff --git a/remove.py b/remove.py
index ac48ab1..894ffc0 100644
--- a/remove.py
+++ b/remove.py
@@ -7,17 +7,20 @@ import pickle
 to_be_removed = []
-def _is_success(id,resp,exception):
+
+
+def _is_success(id, resp, exception):
     global to_be_removed
     if exception is not None:
-        exp = str(exception).split('?')[0].split('/')
-        if exp[0].startswith(' 0:
         print('Removing %d members.' % len(to_be_removed))
-        tbr = [ to_be_removed[i:i + 100] for i in range(0, len(to_be_removed), 100) ]
+        tbr = [to_be_removed[i:i + 100] for i in range(0, len(to_be_removed), 100)]
         to_be_removed = []
         for j in tbr:
-            batch = drive.new_batch_http_request(callback=_is_success)
-            for i in j:
-                batch.add(drive.permissions().delete(fileId=drive_id,permissionId=i,supportsAllDrives=True))
-            batch.execute()
+            batch = drive.new_batch_http_request(callback=_is_success)
+            for i in j:
+                batch.add(drive.permissions().delete(fileId=drive_id, permissionId=i, supportsAllDrives=True))
+            batch.execute()
     print('Users removed.')
+
 if __name__ == '__main__':
     parse = ArgumentParser(description='A tool to remove users from a Shared Drive.')
-    parse.add_argument('--token',default='token.pickle',help='Specify the pickle token file path.')
-    parse.add_argument('--credentials',default='credentials.json',help='Specify the credentials file path.')
+    parse.add_argument('--token', default='token.pickle', help='Specify the pickle token file path.')
+    parse.add_argument('--credentials', default='credentials.json', help='Specify the credentials file path.')
     oft = parse.add_mutually_exclusive_group(required=True)
-    oft.add_argument('--prefix',help='Remove users that match a prefix.')
-    oft.add_argument('--suffix',help='Remove users that match a suffix.')
-    oft.add_argument('--role',help='Remove users based on permission roles.')
+    oft.add_argument('--prefix', help='Remove users that match a prefix.')
+    oft.add_argument('--suffix', help='Remove users that match a suffix.')
+    oft.add_argument('--role', help='Remove users based on permission roles.')
     parsereq = parse.add_argument_group('required arguments')
-    parsereq.add_argument('--drive-id','-d',help='The ID of the Shared Drive.',required=True)
+    parsereq.add_argument('--drive-id', '-d', help='The ID of the Shared Drive.', required=True)
     args = parse.parse_args()
     remove(
diff --git a/requirements.txt b/requirements.txt
index 336e438..ae20c34 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,3 +3,4 @@ urllib3==1.24.2
 httplib2shim==0.0.3
 protobuf==3.9.2
 google_api_python_client==1.7.11
+filelock==3.0.12
\ No newline at end of file