Asian training dataset (from glint) discussion. #256
Comments
Thanks for Sharing |
@nttstar will you train new models with these data? |
A real credit to the industry. |
Is anyone from Glint here? I'm from Bitmain downstairs; the download speed is too slow. Can I just come upstairs and copy it directly? |
Is there any person who appears in both the msra and celebrity datasets? |
I attended a Glint talk a few days ago where they announced this dataset. I just finished downloading it, several hundred GB, and didn't expect it to show up here so quickly. Thanks! |
After testing, this dataset is pretty clean, but it still contains roughly 0.3%~0.8% noise. |
@aa12356jm could you share it on BaiduYun? |
@nttstar I downloaded the dataset from glint; the faces appear to be similarity-transformed and resized to 400x400. So for arcface, how should I crop/resize them to 112x112? |
@zhenglaizhang I already provided the scripts. |
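For later readers, a minimal sketch of the kind of crop that preprocessing performs: warp the 400x400 glint image to 112x112 with a similarity transform estimated from its five landmarks. The reference template below is the commonly used ArcFace 112x112 one and is an assumption here, not taken from this thread; check src/common/face_preprocess.py for the exact values.

```python
# Minimal sketch (NOT the official insightface script): warp an aligned 400x400
# glint image to a 112x112 ArcFace-style crop using its five landmarks.
import cv2
import numpy as np
from skimage import transform as trans

# Commonly used ArcFace 112x112 reference landmarks (assumption; verify against
# src/common/face_preprocess.py before relying on these values).
ARCFACE_DST = np.array([
    [38.2946, 51.6963],
    [73.5318, 51.5014],
    [56.0252, 71.7366],
    [41.5493, 92.3655],
    [70.7299, 92.2041]], dtype=np.float32)

def crop_112(img, landmark5):
    """img: HxWx3 BGR image; landmark5: (5, 2) array ordered as in the lmk files."""
    tform = trans.SimilarityTransform()
    tform.estimate(np.asarray(landmark5, dtype=np.float32), ARCFACE_DST)
    M = tform.params[0:2, :]                       # 2x3 affine matrix
    return cv2.warpAffine(img, M, (112, 112), borderValue=0.0)
```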
@JianbangZ do you have some idea to solve these problems? |
awesome ! |
Thanks DeepGlint! |
This is meaningful. |
@nttstar the download link is broken. |
@nttstar @JianbangZ How did you download the glint asian face dataset? I cannot find how to register and sign up. |
@nttstar @JianbangZ Why does the Asian face dataset I downloaded only extract to 1.7 GB of faces with 2000+ ids? How should this 90+ GB .tar.gz file be handled? Could you give some guidance? Thanks. |
There are no lmk files in the dataset: |
I have the same problem as @libohit, I can't sign in. |
@Wisgon maybe you need to use another browser |
Can anyone share a copy of the lmk files? Their official site seems to be under maintenance; I could download nothing. |
I can't download it right now, what's going on? @—@ |
I can't download the dataset; when I click the Download button, an error appears: |
I can't sign in at http://trillionpairs.deepglint.com/data, the "sign in" button is greyed out! |
@shineway14 You can use http://trillionpairs.deepglint.com/login to sign in. When you finish filling in the blanks, press Enter instead of the 'log in' button. |
@nttstar What is the exact script to merge msra and celeb? |
@aaaaaaaak The Asian face dataset I recently downloaded via BitTorrent is fine and matches the data provided officially; here is mine for your reference. |
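If the extracted contents look truncated (as described a few comments up), it is worth verifying the archive against its published MD5 before extracting; a minimal Python sketch, with the file name and expected hash taken from the dataset listing later in this thread:

```python
# Minimal sketch: verify a downloaded archive against its published MD5
# before extracting (MD5 values are listed in the dataset description below).
import hashlib

def md5sum(path, chunk_size=1 << 20):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Example (file name / expected hash from the listing in this thread):
# assert md5sum("train_celebrity.tar.gz") == "9f2e9858afb6c1032c4f9d7332a92064"
```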
Hi all, OpenCV Error: Assertion failed (src.cols > 0 && src.rows > 0) in warpAffine, file /build/buildd/opencv-2.4.8+dfsg1/modules/imgproc/src/imgwarp.cpp, line 3445
Traceback (most recent call last):
File "face2rec2.py", line 256, in <module>
image_encode(args, i, item, q_out)
File "face2rec2.py", line 99, in image_encode
img = face_preprocess.preprocess(img, bbox = item.bbox, landmark=item.landmark, image_size='%d,%d'%(args.image_h, args.image_w))
File "../common/face_preprocess.py", line 107, in preprocess
warped = cv2.warpAffine(img,M,(image_size[1],image_size[0]), borderValue = 0.0)
cv2.error: /build/buildd/opencv-2.4.8+dfsg1/modules/imgproc/src/imgwarp.cpp:3445: error: (-215) src.cols > 0 && src.rows > 0 in function warpAffine |
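That assertion typically fires because the input image is empty, i.e. cv2.imread returned None (for example when a path in the generated .lst does not exist or a file is corrupted). A minimal guard one could add before calling face_preprocess.preprocess, as a sketch rather than an actual patch to face2rec2.py:

```python
# Sketch of a guard before face_preprocess.preprocess(): cv2.warpAffine asserts
# src.cols > 0 && src.rows > 0, which usually means the image failed to load.
import os
import cv2

def load_image_checked(path):
    if not os.path.exists(path):
        raise FileNotFoundError("image path from the .lst file does not exist: %s" % path)
    img = cv2.imread(path)
    if img is None or img.size == 0:
        raise ValueError("cv2.imread could not decode image: %s" % path)
    return img
```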
@jackytu256 Does the package you downloaded contain the lmk files? |
Hello @Talgin, emore is based on MS-Celeb, just like the non-Asian component of faces_glint. I would merge emore with the Asian part only, but I could be wrong. |
@mlourencoeb, |
@zhouwei5113 Have you solved your problem? I also got a really low score on trillionpairs. |
@nttstar Below is the corresponding PyTorch implementation: class SpatialGroupEnhance(nn.Module): # (kernel 3, stride 2, padding 1) halves h/w; (3, 1, 1) keeps the same size
|
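The rest of that snippet did not survive in this thread. For reference, here is a self-contained sketch of a SpatialGroupEnhance module following the SGE ("Spatial Group-wise Enhance") paper; this is a reconstruction, not necessarily the poster's exact code.

```python
import torch
import torch.nn as nn

class SpatialGroupEnhance(nn.Module):
    """Sketch of the SGE module (reconstructed from the paper, not the poster's code)."""
    def __init__(self, groups=64):
        super(SpatialGroupEnhance, self).__init__()
        self.groups = groups
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.weight = nn.Parameter(torch.zeros(1, groups, 1, 1))
        self.bias = nn.Parameter(torch.ones(1, groups, 1, 1))
        self.sig = nn.Sigmoid()

    def forward(self, x):
        b, c, h, w = x.size()                      # c must be divisible by groups
        x = x.view(b * self.groups, -1, h, w)
        xn = x * self.avg_pool(x)                  # similarity to group-wise global context
        xn = xn.sum(dim=1, keepdim=True)
        t = xn.view(b * self.groups, -1)
        t = t - t.mean(dim=1, keepdim=True)        # normalize the attention map per group
        std = t.std(dim=1, keepdim=True) + 1e-5
        t = (t / std).view(b, self.groups, h, w)
        t = t * self.weight + self.bias            # learnable per-group scale and shift
        t = t.view(b * self.groups, 1, h, w)
        x = x * self.sig(t)                        # reweight spatial positions
        return x.view(b, c, h, w)
```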
@nttstar testing verification.. |
Any conclusion about the dataset? Is face_glint = emore + asian_celeb? |
Hi @nttstar, network = r100. We are using 4 Tesla P100 GPUs. @nttstar could you tell us what the problem is? We merged the datasets according to your instructions with dataset_merge.py and no error occurred :) |
Hi @SueeH, they say that face_glint (DeepGlint-Face) includes MS1M-DeepGlint and Asian-DeepGlint. As far as I know, and from reading this (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8698884), MS1M-DeepGlint is a refined version of MS1M (provided by DeepGlint Corp.), and on http://trillionpairs.deepglint.com/overview they say:
So I think emore (MS1MV2) is another refined version of MS1M, separate from what is included in the faces_glint dataset (MS1M-DeepGlint has 2K more ids than MS1MV2, but fewer images: 3.9M vs 5.8M). |
Bro, I'm in Shanghai too. Training MobileFaceNet + arcloss on the webface dataset or face-ms1m always goes to NaN; I don't know if you've tried it. Even with lr set to 0.0001, it turns NaN after twenty-odd epochs (at epoch 24). |
Can anyone share their configuration for training on the Asian faces? Thanks. |
I followed it step by step but get a key error about the image for the asian dataset: |
@Edwardmark I met the same problem as you. Did you get good results on deepglint in the end? |
@maywander No, I didn't. In the end, I used the emore data instead. |
So the models trained on emore perform better on the trillionpairs test platform? @Edwardmark |
@maywander yes, and I don't know why. |
I can generate the glint.lst file normally, but calling face2rec.py always fails. Does anyone know how to set the parameters? Thanks. |
I feel there is a problem with the code. |
No such file or directory: '..../insightface/src/data/property' |
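For reference, face2rec2.py expects a small property file next to the .lst; in insightface it typically holds the id count and image size on a single line. A minimal sketch of writing one, assuming the usual "<num_ids>,<h>,<w>" layout and taking the id counts from the dataset listing below (verify against your version of face2rec2.py):

```python
# Sketch: write the 'property' file that face2rec2.py looks for in the dataset
# directory. Layout assumed to be "<num_ids>,<img_h>,<img_w>" on one line.
num_ids = 86876 + 93979                               # msra + celebrity ids from the listing below
with open("/path/to/dataset/property", "w") as f:     # hypothetical path, adjust to your data dir
    f.write("%d,112,112\n" % num_ids)
```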
@nttstar I used the glint dataset to train the model but only got 77% accuracy on the glint test. Could you share the training log that reaches 86% accuracy? |
How many iterations does it take to train on this combined dataset from scratch, using any of the provided models, until it converges? |
Thanks for the valuable discussion. Has anyone seen improvement on MegaFace and IJB-C when working with the merged dataset? Thanks. |
@nttstar Thanks for the great work. @mlourencoeb Thanks in advance. |
Can you share the src/ source code folder with me? I cannot find it. Thank you. |
Hi @Talgin, I trained a model with the Asian faces but got the same error. How did you solve it? |
I have the same problem. Have you found a solution? |
[Asian-celeb dataset]
The dataset consists of crawled images of celebrities on the web. The images are covered under a Creative Commons Attribution-NonCommercial 4.0 International license (please read the license terms here: http://creativecommons.org/licenses/by-nc/4.0/).
[train_msra.tar.gz]
MD5: c5b668f2204c400099b14f367069aef5
Content: Train dataset called MS-Celeb-1M-v1c with 86,876 ids / 3,923,399 aligned images, cleaned from the MS-Celeb-1M dataset. This dataset has been excluded from both LFW and Asian-Celeb.
Format: *.jpg
Google: https://drive.google.com/file/d/1aaPdI0PkmQzRbWErazOgYtbLA1mwJIfK/view?usp=sharing
[msra_lmk.tar.gz]
MD5: 7c053dd0462b4af243bb95b7b31da6e6
Content: A list of five-point landmarks for the 3,923,399 images in MS-Celeb-1M-v1c.
Format: <path> <label> <x1> <y1> ... <x5> <y5>, where <path> is the path of the image in the tar file train_msceleb.tar.gz, <label> is an integer ranging from 0 to 86,875, and (x, y) is the coordinate of a key point on the aligned image, starting with the left eye.
Google: https://drive.google.com/file/d/1FQ7P4ItyKCneNEvYfJhW2Kff7cOAFpgk/view?usp=sharing
[train_celebrity.tar.gz]
MD5: 9f2e9858afb6c1032c4f9d7332a92064
Content: Train dataset called Asian-Celeb with 93,979 ids / 2,830,146 aligned images. This dataset has been excluded from both LFW and MS-Celeb-1M-v1c.
Format: *.jpg
Google: https://drive.google.com/file/d/1-p2UKlcX06MhRDJxJukSZKTz986Brk8N/view?usp=sharing
[celebrity_lmk.tar.gz]
MD5: 9c0260c77c13fbb32692fc06a5dbfaf0
Content: A list of five-point landmarks for the 2,830,146 images in Asian-Celeb.
Format: <path> <label> <x1> <y1> ... <x5> <y5>, where <path> is the path of the image in the tar file train_celebrity.tar.gz, <label> is an integer ranging from 86,876 to 196,319, and (x, y) is the coordinate of a key point on the aligned image, starting with the left eye.
Google: https://drive.google.com/file/d/1sQVV9epoF_8jS3ge6DqbilpWk3UNE8U7/view?usp=sharing
[testdata.tar.gz]
MD5: f17c4712f7562ea6d45f0a158e59b792
Content: Test dataset with 1,862,120 aligned images.
Format: *.jpg
Google: https://drive.google.com/file/d/1ghzuEQqmUFN3nVujfrZfBx_CeGUpWzuw/view?usp=sharing
[testdata_lmk.tar]
MD5: 7e4995eb9976a2cfd2b23db05d76572c
Content: A list of five-point landmarks for the 1,862,120 images in testdata.tar.gz. Features should be extracted in the same sequence and with the same amount as this list.
Format: <path> <x1> <y1> ... <x5> <y5>, where <path> is the path of the image in the tar file testdata.tar.gz and (x, y) is the coordinate of a key point on the aligned image, starting with the left eye.
Google: https://drive.google.com/file/d/1lYzqnPyHXRVgXJYbEVh6zTXn3Wq4JO-I/view?usp=sharing
[feature_tools.tar.gz]
MD5: 227b069d7a83aa43b0cb738c2252dbc4
Content: Feature format transform tool and a sample feature file.
Format: We use the same format as MegaFace (http://megaface.cs.washington.edu/) except that we merge all files into a single binary file.
Google: https://drive.google.com/file/d/1bjZwOonyZ9KnxecuuTPVdY95mTIXMeuP/view?usp=sharing |
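Given the landmark-list format above, a minimal parser sketch; the column layout is my reading of that description, with the five (x, y) pairs starting at the left eye:

```python
# Sketch: parse one line of msra_lmk / celebrity_lmk, assuming the layout
# "<path> <label> <x1> <y1> ... <x5> <y5>" described above.
import numpy as np

def parse_lmk_line(line):
    parts = line.split()
    img_path = parts[0]
    label = int(parts[1])
    landmark5 = np.array(parts[2:12], dtype=np.float32).reshape(5, 2)
    return img_path, label, landmark5
```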
I have the same issues. :( I cannot find the .py files that help with working on the dataset given on Baidu Cloud. |
msra is a cleaned subset of MS1M from glint, while celebrity is the asian dataset.
Generate the training lists with src/data/glint2lst.py, either for both subsets together or for the asian dataset only.
Then run src/data/dataset_merge.py without setting the param model, which will combine all IDs from those two datasets. Finally you will get a dataset that contains about 180K IDs.
Use src/eval/gen_glint.py to prepare the test feature file using a pretrained insightface model. You can also post your private testing results here.
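As a rough illustration of what the merge amounts to at the label level (this is not src/data/dataset_merge.py; it relies only on the label ranges given in the dataset listing above):

```python
# Sketch: because the msra list uses labels in [0, 86875] and the celebrity list
# uses labels in [86876, 196319], the two label spaces are already disjoint and
# can simply be concatenated. Counting distinct labels should give about 180K ids
# in total (86,876 + 93,979 = 180,855).

def count_ids(lmk_files):
    labels = set()
    for path in lmk_files:                      # e.g. ["msra_lmk", "celebrity_lmk"]
        with open(path, "r") as f:
            for line in f:
                parts = line.split()
                if len(parts) >= 2:
                    labels.add(int(parts[1]))   # second column is the id label
    return len(labels)

# print(count_ids(["msra_lmk", "celebrity_lmk"]))   # expected: roughly 180K ids
```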