<!DOCTYPE html>
<html>
<head>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-119544230-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
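// Queue the load timestamp, then register GA property UA-119544230-1 (the config call also records the initial page view).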
gtag('js', new Date());
gtag('config', 'UA-119544230-1');
</script>
<meta name="description" content="Associate Professor at University College London (UCL) in machine learning, structured prediction and multitask learning, with experience in computer vision and robotics. Carlo Ciliberto was Lecturer at Imperial College London and previouly postdoc at the Poggio lab at the Massachusetts Institute of Technology (MIT) and did his PhD at the Istituto Italiano di Tecnologia (IIT).">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" type="text/css"> -->
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.0.13/css/all.css" integrity="sha384-DNOHZ68U8hZfKXOrtjWvjxusGo9WQnrNx2sqG0tfsghAvtVlRW3tvkXWZh58N9jp" crossorigin="anonymous">
<link rel="stylesheet" href="prova1.css">
<link rel="apple-touch-icon" sizes="180x180" href="favicon/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="favicon/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="favicon/favicon-16x16.png">
<link rel="manifest" href="/site.webmanifest">
<link rel="mask-icon" href="favicon/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">
<title>Carlo Ciliberto - Associate Professor in Machine Learning</title>
<!-- LinkedIn Badge -->
<script type="text/javascript" src="https://platform.linkedin.com/badges/js/profile.js" async defer></script>
</head>
<body>
<nav class="navbar navbar-expand-md bg-dark navbar-dark sticky-top" id="s_navbar1">
<div class="container">
<!-- <img class="img-fluid d-block" src="name.png" width="10%" height="10%"> -->
<button class="navbar-toggler navbar-toggler-right" type="button" data-toggle="collapse" data-target="#navbar2SupportedContent" aria-controls="navbar2SupportedContent" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"> </span>Carlo Ciliberto</button>
<div class="collapse navbar-collapse text-center justify-content-end" id="navbar2SupportedContent">
<ul class="navbar-nav">
<li class="nav-item">
<a class="nav-link" href="#publications">Publications</a>
</li>
<li class="nav-item">
<a class="nav-link" href="https://github.com/cciliber">Software</a>
</li>
<li class="nav-item">
<a class="nav-link" href="https://cciliber.github.io/intro-slt/">Teaching</a>
</li>
<li class="nav-item">
<a class="nav-link" href="#contacts">Contacts</a>
</li>
<li class="nav-item">
<a class="nav-link" href="carlo_ciliberto_cv.pdf">CV</a>
</li>
</ul>
</div>
</div>
</nav>
<div class="pt-3 pb-1 text-white bg-primary" id="s_cover3">
<div class="container">
<div class="row">
<div class="text-md-left text-center align-self-center my-2 col-md-7">
<h1 class="display-3 text-right">Carlo Ciliberto</h1>
<p class="lead text-right"> Associate Professor in Machine Learning
<br> University College London </p>
</div>
<div class="col-md-1">
</div>
<div class="col-md-2">
<img id="c_img" alt="Carlo Ciliberto - Researcher in Machine Learning, Structured Prediction, Multitask Learning, Computer Vision and Robotics" class="img-fluid d-block rounded-circle mx-auto" src="carlo_ciliberto.jpg"> </div>
</div>
</div>
</div>
<div class="py-5">
<div class="container">
<div id="c_row-2col-a" class="row">
<div class="col-md-12" id="publications">
<div id="template-pub">
<div class="btn-light alert py-2 my-2 border border-secondary" onclick="$(this).next().toggle()">
<p class="lead py-0 my-0"> <span class="alert alert-success p-1 text-success lead" type="tipo" style="font-size:smaller"> <span class="venue" style="font-weight: bolder;">NIPS</span> <span class="date">2017</span></span> <span class="title" style="display:inline-block">Prova</span>
<br> <span class="authors" style="font-size:smaller; font-style:italic; font-weight: bolder;">Autori</span>
<a class="checkempty link-pdf px-2" href="" style="display:inline; float:right"><i class="far fa-file-pdf"></i></a>
<a class="link-code checkempty px-2" style="display:inline; float:right" href=""><i class="fas fa-code"></i></a>
<a class="link-video checkempty px-2" href="" style="display:inline; float:right"><i class="fa fa-film"></i></a>
<!-- <i class="far fa-images"> -->
<a class="link-slides checkempty px-2" href="" style="display:inline; float:right;">slides</i></a>
</p>
</div>
<div class="addInfo alert border border-secondary">
<div style="display:block; width:100%">
<img class="picture" style="width:40%; display:inline; float:right" src=""> </div>
<p class="abstract text-justify">Abstract</p>
<pre class="bibtex" style="font-size:smaller"> bibtex
</pre> </div>
</div>
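<!-- The hidden #template-pub block above is the card template for each publication
     entry: clicking the card header calls $(this).next().toggle() (jQuery) to show
     or hide the .addInfo panel with the picture, abstract and bibtex, while links
     tagged .checkempty are presumably hidden by the page script whenever their
     href is empty. -->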
<div id="pub-list-container">
<h1 class="" id="c_heading" style="display:inline">Publications
<input class="search form-control" placeholder="Search" style="width:38.2%; display:inline; float:right"> </h1>
<div class="list"> </div>
</div>
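<!-- #pub-list-container above follows List.js conventions: an input with class
     "search" filters the items rendered into the div with class "list", using the
     value classes tagged in the markup (.title, .authors, .venue, .date, .abstract,
     .tags). The raw entries live in #list-source-container below. The wiring script
     is not part of this file; a minimal sketch of how such a searchable list could
     be initialized, assuming List.js is loaded and the entries have already been
     rendered into the .list div (the variable name pubList is hypothetical):

     <script>
       var pubList = new List('pub-list-container', {
         valueNames: ['title', 'authors', 'venue', 'date', 'abstract', 'tags']
       });
     </script>
-->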
<div id="list-source-container">
<ul class="list">
<li>
<h3 class="title">Distribution Regression with Sliced Wasserstein Kernels</h3> <span class="authors">Dimitri Meunier, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2022</span>
<p class="abstract">The problem of learning functions over spaces of probabilities - or distribution regression - is gaining significant interest in the machine learning community. A key challenge behind this problem is to identify a suitable representation capturing all relevant properties of the underlying functional mapping. A principled approach to distribution regression is provided by kernel mean embeddings, which lifts kernel-induced similarity on the input domain at the probability level. This strategy effectively tackles the two-stage sampling nature of the problem, enabling one to derive estimators with strong statistical guarantees, such as universal consistency and excess risk bounds. However, kernel mean embeddings implicitly hinge on the maximum mean discrepancy (MMD), a metric on probabilities, which may fail to capture key geometrical relations between distributions. In contrast, optimal transport (OT) metrics, are potentially more appealing, as documented by the recent literature on the topic. In this work, we propose the first OT-based estimator for distribution regression. We build on the Sliced Wasserstein distance to obtain an OT-based representation. We study the theoretical properties of a kernel ridge regression estimator based on such representation, for which we prove universal consistency and excess risk bounds. Preliminary experiments complement our theoretical findings by showing the effectiveness of the proposed approach and compare it with MMD-based estimators.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2202.03926.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/distribution-regression-sliced22/distribution-regression-sliced22.png">
<pre class="bibtex">@inproceedings{meunier2022distribution,
title={Distribution Regression with Sliced Wasserstein Kernels},
author={Meunier, Dimitri and Pontil, Massimiliano and Ciliberto, Carlo},
booktitle={International Conference on Machine Learning},
year={2022}
}
</pre>
<p class="tags">Distribution Regression, Kernels, Statistical Learning Theory, Wasserstein, Optimal Transport</p>
</li>
<li>
<h3 class="title">Measuring Dissimilarity with Diffeomorphism Invariance</h3> <span class="authors">Théophile Cantelobre, Carlo Ciliberto, Benjamin Guedj, Alessandro Rudi</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2022</span>
<p class="abstract">Measures of similarity (or dissimilarity) are a key ingredient to many machine learning algorithms. We introduce DID, a pairwise dissimilarity measure applicable to a wide range of data spaces, which leverages the data's internal structure to be invariant to diffeomorphisms. We prove that DID enjoys properties which make it relevant for theoretical study and practical use. By representing each datum as a function, DID is defined as the solution to an optimization problem in a Reproducing Kernel Hilbert Space and can be expressed in closed-form. In practice, it can be efficiently approximated via Nystr\"om sampling. Empirical experiments support the merits of DID,</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2202.05614.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{cantelobre2022measuring,
title={Measuring dissimilarity with diffeomorphism invariance},
author={Cantelobre, Théophile and Guedj, Benjamin and Ciliberto, Carlo and Rudi, Alessandro},
booktitle={International Conference on Machine Learning},
year={2022}
}
</pre>
<p class="tags">Dissimilarity, Computer Vision, Kernels, Statistical Learning Theory, Nystrom</p>
</li>
<li>
<h3 class="title">Implicit Kernel Meta-learning using Kernel Integral Forms</h3> <span class="authors">John Isak Texas Falk, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">UAI</span> <span class="date">2022</span>
<p class="abstract">Meta-learning algorithms have made significant progress in the context of meta-learning for image classification but less attention has been given to the regression setting. In this paper we propose to learn the probability distribution representing a random feature kernel that we wish to use within kernel ridge regression (KRR). We introduce two instances of this meta-learning framework, learning a neural network pushforward for a translation-invariant kernel and an affine pushforward for a neural network random feature kernel, both mapping from a Gaussian latent distribution. We learn the parameters of the pushforward by minimizing a meta-loss associated to the KRR objective. Since the resulting kernel does not admit an analytical form, we adopt a random feature sampling approach to approximate it. We call the resulting method Implicit Kernel Meta-Learning (IKML). We derive a meta-learning bound for IKML, which shows the role played by the number of tasks , the task sample size , and the number of random features . In particular the bound implies that can be the chosen independently of and only mildly dependent on . We introduce one synthetic and two real-world meta-learning regression benchmark datasets. Experiments on these datasets show that IKML performs best or close to best when compared against competitive meta-learning methods.</p>
<a class="link-pdf" href="https://openreview.net/pdf?id=rNgqwPUsqgq">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{falk2022implicit,
title={Implicit Kernel Meta-learning using Kernel Integral Forms},
author={Falk, Isak and Ciliberto, Carlo and Pontil, Massimiliano},
journal={Uncertainty in Artificial Intelligence (UAI)},
year={2022}
}
</pre>
<p class="tags">Meta-learning, Kernels, Statistical Learning Theory, Implicit Kernel Learning</p>
</li>
<li>
<h3 class="title">Modular Adaptive Policy Selection for Multi-Task Imitation Learning through Task Division</h3> <span class="authors">Dafni Antotsiou, Carlo Ciliberto, Tae-Kyun Kim</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICRA</span> <span class="date">2022</span>
<p class="abstract">Deep imitation learning requires many expert demonstrations, which can be hard to obtain, especially when many tasks are involved. However, different tasks often share similarities, so learning them jointly can greatly benefit them and alleviate the need for many demonstrations. But, joint multi-task learning often suffers from negative transfer, sharing information that should be task-specific. In this work, we introduce a method to perform multi-task imitation while allowing for task-specific features. This is done by using proto-policies as modules to divide the tasks into simple sub-behaviours that can be shared. The proto-policies operate in parallel and are adaptively chosen by a selector mechanism that is jointly trained with the modules. Experiments on different sets of tasks show that our method improves upon the accuracy of single agents, task-conditioned and multi-headed multi-task agents, as well as state-of-the-art meta learning agents. We also demonstrate its ability to autonomously divide the tasks into both shared and task-specific sub-behaviours.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2203.14855.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{antotsiou2022modular,
title={Modular Adaptive Policy Selection for Multi-Task Imitation Learning through Task Division},
author={Antotsiou, Dafni and Ciliberto, Carlo and Kim, Tae-Kyun},
booktitle={2022 IEEE International Conference on Robotics and Automation (ICRA)},
year={2022},
organization={IEEE}
}
</pre>
<p class="tags">Meta-learning, Kernels, Statistical Learning Theory, Implicit Kernel Learning</p>
</li>
<li>
<h3 class="title">The Role of Global Labels in Few-Shot Classification and How to Infer Them</h3> <span class="authors">Ruohan Wang, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2021</span>
<p class="abstract">Few-shot learning is a central problem in meta-learning, where learners must quickly adapt to new tasks given limited training data. Recently, feature pre-training has become a ubiquitous component in state-of-the-art meta-learning methods and is shown to provide significant performance improvement. However, there is limited theoretical understanding of the connection between pre-training and meta-learning. Further, pre-training requires global labels shared across tasks, which may be unavailable in practice. In this paper, we show why exploiting pre-training is theoretically advantageous for meta-learning, and in particular the critical role of global labels. This motivates us to propose Meta Label Learning (MeLa), a novel meta-learning framework that automatically infers global labels to obtains robust few-shot models. Empirically, we demonstrate that MeLa is competitive with existing methods and provide extensive ablation experiments to highlight its key properties.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2108.04055.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/role-global-labels21/role-global-labels21.png">
<pre class="bibtex">@article{wang2021role,
title={The Role of Global Labels in Few-Shot Classification and How to Infer Them},
author={Wang, Ruohan and Pontil, Massimiliano and Ciliberto, Carlo},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={27160--27170},
year={2021}
}
</pre>
<p class="tags">Meta-learning, Transfer Learning, Global Labels, Pre-training</p>
</li>
<li>
<h3 class="title">Psd representations for effective probability models</h3> <span class="authors">Alessandro Rudi, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2021</span>
<p class="abstract">Finding a good way to model probability densities is key to probabilistic inference. An ideal model should be able to concisely approximate any probability while being also compatible with two main operations: multiplications of two models (product rule) and marginalization with respect to a subset of the random variables (sum rule). In this work, we show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly suited to this end. In particular, we characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees. Moreover, we show that we can perform efficiently both sum and product rule in closed form via matrix operations, enjoying the same versatility of mixture models. Our results open the way to applications of PSD models to density estimation, decision theory, and inference.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2106.16116.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{rudi2021psd,
title={Psd representations for effective probability models},
author={Rudi, Alessandro and Ciliberto, Carlo},
journal={Advances in Neural Information Processing Systems},
volume={34},
pages={19411--19422},
year={2021}
}
</pre>
<p class="tags"> Density Estimation, PSD Models, Postive Definite Models</p>
</li>
<li>
<h3 class="title">Adversarial imitation learning with trajectorial augmentation and correction</h3> <span class="authors">Dafni Antotsiou, Carlo Ciliberto, Tae-Kyun Kim</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICRA</span> <span class="date">2021</span>
<p class="abstract">Deep Imitation Learning requires a large number of expert demonstrations, which are not always easy to obtain, especially for complex tasks. A way to overcome this shortage of labels is through data augmentation. However, this cannot be easily applied to control tasks due to the sequential nature of the problem. In this work, we introduce a novel augmentation method which preserves the success of the augmented trajectories. To achieve this, we introduce a semi-supervised correction network that aims to correct distorted expert actions. To adequately test the abilities of the correction network, we develop an adversarial data augmented imitation architecture to train an imitation agent using synthetic experts. Additionally, we introduce a metric to measure diversity in trajectory datasets. Experiments show that our data augmentation strategy can improve accuracy and convergence time of adversarial imitation while preserving the diversity between the generated and real trajectories.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2103.13887.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{antotsiou2021adversarial,
title={Adversarial imitation learning with trajectorial augmentation and correction},
author={Antotsiou, Dafni and Ciliberto, Carlo and Kim, Tae-Kyun},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
pages={4724--4730},
year={2021},
organization={IEEE}
}
</pre>
<p class="tags"> Adversarial Learning, Imitation Learning</p>
</li>
<li>
<h3 class="title">Structured Prediction for CRiSP Inverse Kinematics Learning with Misspecified Robot Models</h3> <span class="authors">Gian Maria Marconi, Raffaello Camoriano, Lorenzo Rosasco, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">IEEE Robotics and Automation Letters (RAL)</span> <span class="date">2021</span>
<p class="abstract">With the recent advances in machine learning, problems that traditionally would require accurate modeling to be solved analytically can now be successfully approached with data-driven strategies. Among these, computing the inverse kinematics of a redundant robot arm poses a significant challenge due to the non-linear structure of the robot, the hard joint constraints and the non-invertible kinematics map. Moreover, most learning algorithms consider a completely data-driven approach, while often useful information on the structure of the robot is available and should be positively exploited. In this work, we present a simple, yet effective, approach for learning the inverse kinematics. We introduce a structured prediction algorithm that combines a data-driven strategy with the model provided by a forward kinematics function – even when this function is misspecified – to accurately solve the problem. The proposed approach ensures that predicted joint configurations are well within the robot’s constraints. We also provide statistical guarantees on the generalization properties of our estimator as well as an empirical evaluation of its performance on trajectory reconstruction tasks.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2102.12942.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{marconi2021structured,
title={Structured Prediction for CRiSP Inverse Kinematics Learning with Misspecified Robot Models},
author={Marconi, Gian Maria and Camoriano, Raffaello and Rosasco, Lorenzo and Ciliberto, Carlo},
journal={IEEE Robotics and Automation Letters},
volume={6},
number={3},
pages={5650--5657},
year={2021},
publisher={IEEE}
}
</pre>
<p class="tags"> Structured Prediction, Inverse Kinematics, Robotics, Learning</p>
</li>
<li>
<h3 class="title">Statistical Limits of Supervised Quantum Learning</h3> <span class="authors">Carlo Ciliberto, Andrea Rocchetto, Alessandro Rudi, Leonard Wossnig</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Physical Review A</span> <span class="date">2020</span>
<p class="abstract">Within the framework of statistical learning theory it is possible to bound the minimum number of samples required by a learner to reach a target accuracy. We show that if the bound on the accuracy is taken into account, quantum machine learning algorithms for supervised learning—for which statistical guarantees are available—cannot achieve polylogarithmic runtimes in the input dimension. We conclude that, when no further assumptions on the problem are made, quantum machine learning algorithms for supervised learning can have at most polynomial speedups over efficient classical algorithms, even in cases where quantum access to the data is naturally available.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/2001.10477.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{ciliberto2020statistical,
title={Statistical limits of supervised quantum learning},
author={Ciliberto, Carlo and Rocchetto, Andrea and Rudi, Alessandro and Wossnig, Leonard},
journal={Physical Review A},
volume={102},
number={4},
pages={042414},
year={2020},
publisher={APS}
}
</pre>
<p class="tags"> Quantum Computing, Statistical Learning Theory</p>
</li>
<li>
<h3 class="title">Structured Prediction for Conditional Meta-Learning</h3> <span class="authors">Ruohan Wang, Yiannis Demiris, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2020</span>
<p class="abstract"> The goal of optimization-based meta-learning is to find a single initialization shared across a distribution of tasks to speed up the process of learning new tasks. Conditional meta-learning seeks task-specific initialization to better capture complex task distributions and improve performance. However, many existing conditional methods are difficult to generalize and lack theoretical guarantees. In this work, we propose a new perspective on conditional meta-learning via structured prediction. We derive task-adaptive structured meta-learning (TASML), a principled framework that yields task-specific objective functions by weighing meta-training data on target tasks. Our non-parametric approach is model-agnostic and can be combined with existing meta-learning methods to achieve conditioning. Empirically, we show that TASML improves the performance of existing meta-learning models, and outperforms the state-of-the-art on benchmark datasets. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/2002.08799.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/structured-prediction-for-conditional-meta-learning/motivation-alg.png">
<pre class="bibtex">@article{wang2020structured,
title={Structured Prediction for Conditional Meta-learning},
author={Wang, Ruohan and Demiris, Yiannis and Ciliberto, Carlo},
journal={Neural Information Processing Systems (NeurIPS) 2020},
year={2020}
}
</pre>
<p class="tags"> Machine Learning, Structured Prediction, Meta-Learning, Kernel Methods, Learning-to-learn, Lifelong learning</p>
</li>
<li>
<h3 class="title">The Advantage of Conditional Meta-Learning for
Biased Regularization and Fine-Tuning</h3> <span class="authors">Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2020</span>
<p class="abstract"> Biased regularization and fine-tuning are two recent meta-learning approaches. They have been shown to be effective to tackle distributions of tasks, in which the tasks’ target vectors are all close to a common meta-parameter vector. However, these methods may perform poorly on heterogeneous environments of tasks, where the complexity of the tasks’ distribution cannot be captured by a single meta-parameter vector. We address this limitation by conditional meta-learning, inferring a conditioning function mapping task’s side information into a meta-parameter vector that is appropriate for that task at hand. We characterize properties of the environment under which the conditional approach brings a substantial advantage over standard meta-learning and we highlight examples of environments, such as those with multiple clusters, satisfying these properties. We then propose a convex meta-algorithm providing a comparable advantage also in practice. Numerical experiments confirm our theoretical findings. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/2008.10857.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{denevi2020advantage,
title={The Advantage of Conditional Meta-Learning for Biased Regularization and Fine-Tuning},
author={Denevi, Giulia and Pontil, Massimiliano and Ciliberto, Carlo},
journal={Neural Information Processing Systems (NeurIPS) 2020},
year={2020}
}
</pre>
<p class="tags"> Machine Learning, Online Learning, Meta-learning</p>
</li>
<li>
<h3 class="title">Hyperbolic Manifold Regression</h3> <span class="authors">Gian Maria Marconi, Lorenzo Rosasco, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">AISTATS</span> <span class="date">2020</span>
<p class="abstract"> Geometric representation learning has recently shown great promise in several machine learning settings, ranging from relational learning to language processing and generative models. In this work, we consider the problem of performing manifold-valued regression onto an hyperbolic space as an intermediate component for a number of relevant machine learning applications. In particular, by formulating the problem of predicting nodes of a tree as a manifold regression task in the hyperbolic space, we propose a novel perspective on two challenging tasks: 1) hierarchical classification via label embeddings and 2) taxonomy extension of hyperbolic representations. To address the regression problem we consider previous methods as well as proposing two novel approaches that are computationally more advantageous: a parametric deep learning model that is informed by the geodesics of the target space and a non-parametric kernel-method for which we also prove excess risk bounds. Our experiments show that the strategy of leveraging the hyperbolic geometry is promising. In particular, in the taxonomy expansion setting, we find that the hyperbolic-based estimators significantly outperform methods performing regression in the ambient Euclidean space. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/2005.13885.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/hyperbolic-manifold-regression/hyperbolic-manifold-regression.png">
<pre class="bibtex">@article{marconi2020hyperbolic,
title={Hyperbolic Manifold Regression},
author={Marconi, Gian Maria and Rosasco, Lorenzo and Ciliberto, Carlo},
journal={Artificial Intelligence and Statistics (AISTATS) 2020},
year={2020}
}
</pre>
<p class="tags"> Machine Learning, Structured Prediction, Hyperbolic Embeddings, Kernel Methods</p>
</li>
<li>
<h3 class="title">A General Framework for Consistent Structured Prediction with Implicit Loss Embeddings</h3> <span class="authors">Carlo Ciliberto, Lorenzo Rosasco, Alessandro Rudi</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">JMLR</span> <span class="date">2020</span>
<p class="abstract"> We propose and analyze a novel theoretical and algorithmic framework for structured prediction. While so far the term has referred to discrete output spaces, here we consider more general settings, such as manifolds or spaces of probability measures. We define structured prediction as a problem where the output space lacks a vectorial structure. We identify and study a large class of loss functions that implicitly defines a suitable geometry on the problem. The latter is the key to develop an algorithmic framework amenable to a sharp statistical analysis and yielding efficient computations. When dealing with output spaces with infinite cardinality, a suitable implicit formulation of the estimator is shown to be crucial. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/2002.05424.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@article{ciliberto2020general,
title={A General Framework for Consistent Structured Prediction with Implicit Loss Embeddings},
author={Ciliberto, Carlo and Rosasco, Lorenzo and Rudi, Alessandro},
journal={Journal of Machine Learning Research (JMLR)},
year={2020}
}
</pre>
<p class="tags"> Machine Learning, Structured Prediction, Statistical Learning Theory</p>
</li>
<li>
<h3 class="title">Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm</h3> <span class="authors">Giulia Luise, Saverio Salzo, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2019</span>
<p class="abstract"> We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence. Based on a Frank-Wolfe optimization strategy, our approach proceeds by populating the support of the barycenter incrementally, without requiring any pre-allocation. We consider discrete as well as continuous distributions, proving convergence rates of the proposed algorithm in both settings. Key elements of our analysis are a new result showing that the Sinkhorn divergence on compact domains has Lipschitz continuous gradient with respect to the Total Variation and a characterization of the sample complexity of Sinkhorn potentials. Experiments validate the effectiveness of our method in practice. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1905.13194.pdf">pdf</a>
<a class="link-code" href="https://github.com/GiulsLu/Sinkhorn-Barycenters">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/sinkhorn-barycenters-frank-wolfe/sinkhorn-barycenters-frank-wolfe.png">
<pre class="bibtex">@inproceedings{luise2019sinkhorn,
title={Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm},
author={Luise, Giulia and Salzo, Saverio and Pontil, Massimiliano and Ciliberto, Carlo},
booktitle={Advances in Neural Information Processing Systems},
pages={9318--9329},
year={2019}
}
</pre>
<p class="tags"> Machine Learning, Optimal Transport, Frank-Wolfe, Barycenters, Sinkhorn</p>
</li>
<li>
<h3 class="title">Localized Structured Prediction</h3> <span class="authors">Carlo Ciliberto, Francis Bach, Alessandro Rudi</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2019</span>
<p class="abstract"> Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made by "parts") that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1806.02402.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/localized-structured-prediction/localized-structured-prediction.png">
<pre class="bibtex">@inproceedings{ciliberto2019localized,
title={Localized structured prediction},
author={Ciliberto, Carlo and Bach, Francis and Rudi, Alessandro},
booktitle={Advances in Neural Information Processing Systems},
pages={7299--7309},
year={2019}
}
</pre>
<p class="tags"> Machine Learning, Structured Prediction, Statistical Learning Theory</p>
</li>
<li>
<h3 class="title">Online-Within-Online Meta-Learning</h3> <span class="authors">Giulia Denevi, Dimitris Stamos, Carlo Ciliberto, Massimiliano Pontil</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NeurIPS</span> <span class="date">2019</span>
<p class="abstract"> We study the problem of learning a series of tasks in a fully online Meta-Learning setting. The goal is to exploit similarities among the tasks to incrementally adapt an inner online algorithm in order to incur a low averaged cumulative error over the tasks. We focus on a family of inner algorithms based on a parametrized variant of online Mirror Descent. The inner algorithm is incrementally adapted by an online Mirror Descent meta-algorithm using the corresponding within-task minimum regularized empirical risk as the meta-loss. In order to keep the process fully online, we approximate the meta-subgradients by the online inner algorithm. An upper bound on the approximation error allows us to derive a cumulative error bound for the proposed method. Our analysis can also be converted to the statistical setting by online-to-batch arguments. We instantiate two examples of the framework in which the meta-parameter is either a common bias vector or feature map. Finally, preliminary numerical experiments confirm our theoretical findings. </p>
<a class="link-pdf" href="https://papers.nips.cc/paper/9468-online-within-online-meta-learning.pdf">pdf</a>
<a class="link-code" href="https://github.com/dstamos/LR-SELF">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@incollection{denevi2019online,
title = {Online-Within-Online Meta-Learning},
author = {Denevi, Giulia and Stamos, Dimitris and Ciliberto, Carlo and Pontil, Massimiliano},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {13110--13120},
year = {2019},
publisher = {Curran Associates, Inc.},
url = {http://papers.nips.cc/paper/9468-online-within-online-meta-learning.pdf}
}
</pre>
<p class="tags"> Machine Learning, Learning-to-Learn, Metalearning, Hyperparameter optimization</p>
</li>
<li>
<h3 class="title">Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction</h3> <span class="authors">Giulia Luise, Dimitris Stamos, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2019</span>
<p class="abstract"> We study the interplay between surrogate methods for structured prediction and techniques from multitask learning designed to leverage relationships between surrogate outputs. We propose an efficient algorithm based on trace norm regularization which, differently from previous methods, does not require explicit knowledge of the coding/decoding functions of the surrogate framework. As a result, our algorithm can be applied to the broad class of problems in which the surrogate space is large or even infinite dimensional. We study excess risk bounds for trace norm regularized structured prediction, implying the consistency and learning rates for our estimator. We also identify relevant regimes in which our approach can enjoy better generalization performance than previous methods. Numerical experiments on ranking problems indicate that enforcing low-rank relations among surrogate outputs may indeed provide a significant advantage in practice. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1903.00667.pdf">pdf</a>
<a class="link-code" href="https://github.com/dstamos/LR-SELF">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{luise2019leveraging,
title={Leveraging low-rank relations between surrogate tasks in structured prediction},
author={Luise, Giulia and Stamos, Dimitris and Pontil, Massimiliano and Ciliberto, Carlo},
booktitle={International Conference on Machine Learning},
year={2019}
}
</pre>
<p class="tags"> Machine Learning, Learning-to-Learn, Metalearning, Hyperparameter optimization</p>
</li>
<li>
<h3 class="title">Learning-to-Learn Stochastic Gradient Descent with Biased Regularization</h3> <span class="authors">Giulia Denevi, Carlo Ciliberto, Riccardo Grazzi, Massimiliano Pontil</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2019</span>
<p class="abstract"> We study the problem of learning-to-learn: inferring a learning algorithm that works well on tasks sampled from an unknown distribution. As class of algorithms we consider Stochastic Gradient Descent on the true risk regularized by the square euclidean distance to a bias vector. We present an average excess risk bound for such a learning algorithm. This result quantifies the potential benefit of using a bias vector with respect to the unbiased case. We then address the problem of estimating the bias from a sequence of tasks. We propose a meta-algorithm which incrementally updates the bias, as new tasks are observed. The low space and time complexity of this approach makes it appealing in practice. We provide guarantees on the learning ability of the meta-algorithm. A key feature of our results is that, when the number of tasks grows and their variance is relatively small, our learning-to-learn approach has a significant advantage over learning each task in isolation by Stochastic Gradient Descent without a bias term. We report on numerical experiments which demonstrate the effectiveness of our approach. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1903.10399.pdf">pdf</a>
<a class="link-code" href="https://github.com/prolearner/onlineLTL">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{denevi2019learning,
title={Learning-to-Learn Stochastic Gradient Descent with Biased Regularization},
author={Denevi, Giulia and Ciliberto, Carlo and Grazzi, Riccardo and Pontil, Massimiliano},
booktitle={International Conference on Machine Learning},
year={2019}
}
</pre>
<p class="tags"> Machine Learning, Learning-to-Learn, Metalearning, Hyperparameter optimization</p>
</li>
<li>
<h3 class="title">Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation</h3> <span class="authors">Ruohan Wang, Carlo Ciliberto, Pierluigi Amadori, Yiannis Demiris</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2019</span>
<p class="abstract"> We consider the problem of imitation learning from a finite set of expert trajectories, without access to reinforcement signals. The classical approach of extracting the expert's reward function via inverse reinforcement learning, followed by reinforcement learning is indirect and may be computationally expensive. Recent generative adversarial methods based on matching the policy distribution between the expert and the agent could be unstable during training. We propose a new framework for imitation learning by estimating the support of the expert policy to compute a fixed reward function, which allows us to re-frame imitation learning within the standard reinforcement learning setting. We demonstrate the efficacy of our reward function on both discrete and continuous domains, achieving comparable or better performance than the state of the art under different reinforcement learning algorithms. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1905.06750.pdf">pdf</a>
<a class="link-code" href="https://github.com/RuohanW/red">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="">
<pre class="bibtex">@inproceedings{wang2019random,
title={Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation},
author={Wang, Ruohan and Ciliberto, Carlo and Amadori, Pierluigi and Demiris, Yiannis},
booktitle={International Conference on Machine Learning},
year={2019}
}
</pre>
<p class="tags"> Machine Learning, Reinforcement Learning, Imitation Learning, Support Estimation, MUJOCO</p>
</li>
<li>
<h3 class="title">Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance</h3> <span class="authors">Giulia Luise, Alessandro Rudi, Massimiliano Pontil, Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NIPS</span> <span class="date">2018</span>
<p class="abstract"> Applications of optimal transport have recently gained remarkable attention thanks to the computational advantages of entropic regularization. However, in most situations the Sinkhorn approximation of the Wasserstein distance is replaced by a regularized version that is less accurate but easy to differentiate. In this work we characterize the differential properties of the original Sinkhorn distance, proving that it enjoys the same smoothness as its regularized version and we explicitly provide an efficient algorithm to compute its gradient. We show that this result benefits both theory and applications: on one hand, high order smoothness confers statistical guarantees to learning with Wasserstein approximations. On the other hand, the gradient formula allows us to efficiently solve learning and optimization problems in practice. Promising preliminary experiments complement our analysis. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1805.11897.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="" src="papers/differential-sinkhorn-wasserstein-learning18/differential-sinkhorn-wasserstein-learning.png">
<pre class="bibtex">@inproceedings{luise2018differential,
title={Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance},
author={Luise, Giulia and Rudi, Alessandro and Pontil, Massimiliano and Ciliberto, Carlo},
booktitle={Advances in Neural Information Processing Systems},
year={2018}
}
</pre>
<p class="tags"> Machine Learning, Wasserstein, Optimal Transport, Sinkhorn, Structured Prediction</p>
</li>
<li>
<h3 class="title">Manifold Structured Prediction</h3> <span class="authors">Alessandro Rudi*, Carlo Ciliberto*, Gian Maria Marconi, Lorenzo Rosasco</span> <span class="venue type relevance addmsg" type="conference" relevance="2">NIPS</span> <span class="date">2018</span>
<p class="abstract"> Structured prediction provides a general framework to deal with supervised problems where the outputs have semantically rich structure. While classical approaches consider finite, albeit potentially huge, output spaces, in this paper we discuss how structured prediction can be extended to a continuous scenario. Specifically, we study a structured prediction approach to manifold valued regression. We characterize a class of problems for which the considered approach is statistically consistent and study how geometric optimization can be used to compute the corresponding estimator. Promising experimental results on both simulated and real data complete our study.</p>
<a class="link-pdf" href="https://arxiv.org/pdf/1806.09908.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="Manifold Structured Prediction. First work investigating how to impose nonlinear relations and constraints among multiple tasks" src="papers/manifold-structured-prediction18/manifold-structured-prediction.png">
<pre class="bibtex">@article{ciliberto2018manifold,
title={Manifold Structured Prediction},
author={Rudi, Alessandro and Ciliberto, Carlo and Marconi, Gian Maria and Rosasco, Lorenzo},
journal={Advances in Neural Information Processing Systems},
year={2018}
}
</pre>
<p class="tags">structured-prediction manifold manifold-regression kernel-methods </p>
</li>
<li>
<h3 class="title">Incremental Learning-to-Learn with Statistical Guarantees</h3> <span class="authors">Giulia Denevi, Carlo Ciliberto, Dimitris Stamos, Massimiliano Pontil</span> <span class="venue type relevance addmsg" type="conference" relevance="2">UAI</span> <span class="date">2018</span>
<p class="abstract"> In learning-to-learn the goal is to infer a learning algorithm that works well on a class of tasks sampled from an unknown meta distribution. In contrast to previous work on batch learning-to-learn, we consider a scenario where tasks are presented sequentially and the algorithm needs to adapt incrementally to improve its performance on future tasks. Key to this setting is for the algorithm to rapidly incorporate new observations into the model as they arrive, without keeping them in memory. We focus on the case where the underlying algorithm is ridge regression parameterized by a positive semidefinite matrix. We propose to learn this matrix by applying a stochastic strategy to minimize the empirical error incurred by ridge regression on future tasks sampled from the meta distribution. We study the statistical properties of the proposed algorithm and prove non-asymptotic bounds on its excess transfer risk, that is, the generalization performance on new tasks from the same meta distribution. We compare our online learning-to-learn approach with a state of the art batch method, both theoretically and empirically. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1803.08089.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/incremental-learning-to-learn-statistical-guarantees18/incremental-learning-to-learn-statistical-guarantees.png">
<pre class="bibtex">@article{denevi2018incremental,
title={Incremental Learning-to-Learn with Statistical Guarantees},
author={Denevi, Giulia and Ciliberto, Carlo and Stamos, Dimitris and Pontil, Massimiliano},
journal={Uncertainty in Artificial Intelligence (UAI)},
year={2018}
}
</pre>
<p class="tags">machine learning, Incremental learning, learning to learn, online learning, transfer learning, multitask learning</p>
</li>
<li>
<h3 class="title">Quantum machine learning: a classical perspective</h3> <span class="authors">Carlo Ciliberto, Mark Herbster, Alessandro Davide Ialongo, Massimiliano Pontil, Andrea Rocchetto, Simone Severini, Leonard Wossnig</span> <span class="venue type relevance addmsg" type="conference" relevance="2">Proceeding of the Royal Society A</span> <span class="date">2018</span>
<p class="abstract"> Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning (ML) techniques to impressive results in regression, classification, data generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical ML algorithms. Here we review the literature in quantum ML and discuss perspectives for a mixed readership of classical ML and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in ML are identified as promising directions for the field. Practical questions, such as how to upload classical data into quantum form, will also be addressed. </p>
<a class="link-pdf" href="https://arxiv.org/pdf/1707.08561.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="">
<pre class="bibtex">@article {ciliberto2018quantum,
author = {Ciliberto, Carlo and Herbster, Mark and Ialongo, Alessandro Davide and Pontil, Massimiliano and Rocchetto, Andrea and Severini, Simone and Wossnig, Leonard},
title = {Quantum machine learning: a classical perspective},
volume = {474},
number = {2209},
year = {2018},
doi = {10.1098/rspa.2017.0551},
publisher = {The Royal Society},
issn = {1364-5021},
URL = {http://rspa.royalsocietypublishing.org/content/474/2209/20170551},
eprint = {http://rspa.royalsocietypublishing.org/content/474/2209/20170551.full.pdf},
journal = {Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences}
}
</pre>
<p class="tags">machine learning, quantum computing, large scale learning</p>
</li>
<li>
<h3 class="title">Consistent Multitask Learning with Nonlinear Output Relations</h3> <span class="authors">Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco, Massimiliano Pontil</span> <span class="venue type relevance addmsg" type="conference" relevance="2">NIPS</span> <span class="date">2017</span>
<p class="abstract"> Key to multitask learning is exploiting relationships between different tasks to improve prediction performance. If the relations are linear, regularization approaches can be used successfully. However, in practice assuming the tasks to
be linearly related might be restrictive, and allowing for nonlinear structures is a challenge. In this paper, we tackle this issue by casting the problem within the framework of structured prediction. Our main contribution is a novel
algorithm for learning multiple tasks which are related by a system of nonlinear equations that their joint outputs need to satisfy. We show that the algorithm is consistent and can be efficiently implemented. Experimental results show
the potential of the proposed method. </p>
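<p>As an informal aid to the abstract above, here is a minimal Python sketch (not the paper's implementation) of the decoding idea: kernel ridge weights are computed on the training set, and the joint prediction is the point on an assumed nonlinear constraint set, illustratively the unit sphere, minimizing the weighted squared distance to the training outputs. The Gaussian kernel, the constraint, and all names are assumptions.</p>
<pre class="code">
# Sketch: multitask prediction when the joint outputs must satisfy a
# nonlinear relation (here, illustratively, ||y|| = 1).
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_alpha(X, lam=1e-3):
    # Kernel ridge scores: alpha(x) = (K + n*lam*I)^{-1} k(x).
    n = X.shape[0]
    K = gaussian_kernel(X, X)
    W = np.linalg.solve(K + n * lam * np.eye(n), np.eye(n))
    return lambda x: (gaussian_kernel(x[None, :], X) @ W).ravel()

def predict(x, alpha_fn, Y):
    a = alpha_fn(x)
    # Decode: argmin over c in C of sum_i a_i * ||c - y_i||^2,
    # with C = {c : ||c|| = 1} as a stand-in nonlinear relation.
    obj = lambda c: np.sum(a * ((c[None, :] - Y) ** 2).sum(-1))
    cons = {"type": "eq", "fun": lambda c: c @ c - 1.0}
    return minimize(obj, x0=Y.mean(0) + 0.1, constraints=[cons]).x

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
Y = rng.standard_normal((30, 2))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # training outputs satisfy C
y_hat = predict(X[0], fit_alpha(X), Y)         # lies on the unit circle
</pre>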
<a class="link-pdf" href="http://arxiv.org/pdf/1705.08118">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="papers/nonlinear-multitask-learning17/slides.pdf">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="Consistent Multitask Learning with Nonlinear Output Relations. First work investigating how to impose nonlinear relations and constraints among multiple tasks" src="papers/nonlinear-multitask-learning17/nonlinear-multitask-learning.png">
<pre class="bibtex">@article{ciliberto2017consistent,
title={Consistent Multitask Learning with Nonlinear Output Relations},
author={Ciliberto, Carlo and Rudi, Alessandro and Rosasco, Lorenzo and Pontil, Massimiliano},
journal={Advances in Neural Information Processing Systems},
year={2017}
}
</pre>
<p class="tags">structured-prediction multitask-learning non-linear-multitask learning kernel-methods </p>
</li>
<li>
<h3 class="title">Visual recognition for humanoid robots</h3> <span class="authors"> Sean Ryan Fanello, Carlo Ciliberto, Nicoletta Noceti, Giorgio Metta, Francesca Odone</span> <span class="venue type relevance addmsg" type="conference" relevance="2">Robotics and Autonomous Systems</span> <span class="date">2017</span>
<p class="abstract">Visual perception is a fundamental component for most robotics systems operating in human environments. Specifically, visual recognition is a prerequisite to a large variety of tasks such as tracking, manipulation, human–robot interaction. As a consequence, the lack of successful recognition often becomes a bottleneck for the application of robotics system to real-world situations. In this paper we aim at improving the robot visual perception capabilities in a natural, human-like fashion, with a very limited amount of constraints to the acquisition scenario. In particular our goal is to build and analyze a learning system that can rapidly be re-trained in order to incorporate new evidence if available. To this purpose, we review the state-of-the-art coding–pooling pipelines for visual recognition and propose two modifications which allow us to improve the quality of the representation, while maintaining real-time performances: a coding scheme, Best Code Entries (BCE), and a new pooling operator, Mid-Level Classification Weights (MLCW). The former focuses entirely on sparsity and improves the stability and computational efficiency of the coding phase, the latter increases the discriminability of the visual representation, and therefore the overall recognition accuracy of the system, by exploiting data supervision. The proposed pipeline is assessed from a qualitative perspective on a Human–Robot Interaction (HRI) application on the iCub platform. Quantitative evaluation of the proposed system is performed both on in-house robotics data-sets (iCubWorld) and on established computer vision benchmarks (Caltech-256, PASCAL VOC 2007). As a byproduct of this work, we provide for the robotics community an implementation of the proposed visual recognition pipeline which can be used as perceptual layer for more complex robotics applications.</p>
<a class="link-pdf" href="papers/visual-recognition-for-robotics17/visual-recognition-for-robotics.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="icub visually recognizes everyday objects" src="papers/visual-recogniton-for-robotics17/visual-recognition-for-robotics.png">
<pre class="bibtex">@article{fanello2017visual,
title={Visual recognition for humanoid robots},
author={Fanello, Sean Ryan and Ciliberto, Carlo and Noceti, Nicoletta and Metta, Giorgio and Odone, Francesca},
journal={Robotics and Autonomous Systems},
volume={91},
pages={151--168},
year={2017},
publisher={Elsevier}
}
</pre>
<p class="tags">machine learning, vision, robotics </p>
</li>
<li>
<h3 class="title">Low Compute and Fully Parallel Computer Vision with HashMatch</h3> <span class="authors"> Sean Ryan Fanello, Julien Valentin, Adarsh Kowdle, Christoph Rhemann, Vladimir Tankovich, Carlo Ciliberto, Philip Davidson, Shahram Izadi</span> <span class="venue type relevance addmsg" type="conference" relevance="2">ICCV</span> <span class="date">2017</span>
<p class="abstract">Numerous computer vision problems such as stereo depth estimation, object-class segmentation and foreground/ background segmentation can be formulated as per pixel image labeling tasks. Given one or many images as input, the desired output of these methods is usually a spatially smooth assignment of labels. The large amount of such computer vision problems has lead to significant research efforts, with the state of art moving from CRF-based approaches to deep CNNs and more recently, hybrids of the two. Although these approaches have significantly advanced the state of the art, the vast majority has solely focused on improving quantitative results and are not designed for low-compute scenarios. In this paper, we present a new general framework for a variety of computer vision labeling tasks, called HashMatch. Our approach is designed to be both fully parallel, i.e. each pixel is independently processed, and low-compute, with a model complexity an order of magnitude less than existing CNN and CRFbased approaches. We evaluate HashMatch extensively on several problems such as disparity estimation, image retrieval, feature approximation and background subtraction, for which HashMatch achieves high computational efficiency while producing high quality results.</p>
<a class="link-pdf" href="http://openaccess.thecvf.com/content_ICCV_2017/papers/Fanello_Low_Compute_and_ICCV_2017_paper.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="icub visually recognizes everyday objects" src="papers/hashmatch-low-compute-fully-parallel17/hashmatch.png">
<pre class="bibtex">@inproceedings{fanello2017low,
title={Low Compute and Fully Parallel Computer Vision with HashMatch},
author={Fanello, Sean Ryan and Valentin, Julien and Kowdle, Adarsh and Rhemann, Christoph and Tankovich, Vladimir and Ciliberto, Carlo and Davidson, Philip and Izadi, Shahram},
booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
pages={3894--3903},
year={2017},
organization={IEEE}
}
</pre>
<p class="tags">machine learning, vision, hashing, markov-random-fields, optimization </p>
</li>
<li>
<h3 class="title">Incremental Robot Learning of New Objects with Fixed Update Time</h3> <span class="authors">Raffaello Camoriano, Giulia Pasquale, Carlo Ciliberto, Lorenzo Natale, Lorenzo Rosasco, Giorgio Metta</span> <span class="venue type relevance addmsg" type="conference" relevance="2">ICRA</span> <span class="date">2017</span>
<p class="abstract">We consider object recognition in the context of lifelong learning, where a robotic agent learns to discriminate between a growing number of object classes as it accumulates experience about the environment. We propose an incremental variant of the Regularized Least Squares for Classification (RLSC) algorithm, and exploit its structure to seamlessly add new classes to the learned model. The presented algorithm addresses the problem of having an unbalanced proportion of training examples per class, which occurs when new objects are presented to the system for the first time.
We evaluate our algorithm on both a machine learning benchmark dataset and two challenging object recognition tasks in a robotic setting. Empirical evidence shows that our approach achieves comparable or higher classification performance than its batch counterpart when classes are unbalanced, while being significantly faster.</p>
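<p>A minimal sketch of the two ingredients named above, incremental least squares updates with fixed per-example cost and rebalancing of unbalanced classes, is given below. The rank-1 (Sherman-Morrison) update and the inverse-frequency rebalancing are a simplified illustration, not the authors' code.</p>
<pre class="code">
# Incremental RLSC sketch: a rank-1 update of the inverse covariance keeps
# the per-example cost fixed (in the feature dimension), and class scores
# are rebalanced by inverse class frequency. Illustrative simplification.
import numpy as np

class IncrementalRLSC:
    def __init__(self, dim, lam=1e-2):
        self.Ainv = np.eye(dim) / lam    # (lam*I + X^T X)^{-1}
        self.B = {}                      # per-class sums of features
        self.counts = {}

    def partial_fit(self, x, label):
        # Sherman-Morrison: (A + x x^T)^{-1} from A^{-1}, O(dim^2).
        Ax = self.Ainv @ x
        self.Ainv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
        self.B.setdefault(label, np.zeros_like(x))
        self.B[label] += x
        self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self, x):
        total = sum(self.counts.values())
        # Inverse-frequency rebalancing of the one-vs-all scores.
        scores = {c: (x @ (self.Ainv @ b)) * total / self.counts[c]
                  for c, b in self.B.items()}
        return max(scores, key=scores.get)

rlsc = IncrementalRLSC(dim=10)
rng = np.random.default_rng(0)
for _ in range(200):
    c = int(rng.integers(0, 3))          # class index also shifts the mean
    rlsc.partial_fit(rng.standard_normal(10) + c, c)
pred = rlsc.predict(np.full(10, 2.0))    # most likely class 2
</pre>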
<a class="link-pdf" href="https://arxiv.org/pdf/1605.05045.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/incremental-learning17/incremental-learning.png">
<pre class="bibtex">@inproceedings{camoriano2017incremental,
title={Incremental robot learning of new objects with fixed update time},
author={Camoriano, Raffaello and Pasquale, Giulia and Ciliberto, Carlo and Natale, Lorenzo and Rosasco, Lorenzo and Metta, Giorgio},
booktitle={Robotics and Automation (ICRA), 2017 IEEE International Conference on},
pages={3207--3214},
year={2017},
organization={IEEE}
}
</pre>
<p class="tags">machine learning, vision, robotics </p>
</li>
<li>
<h3 class="title">Connecting YARP to the Web with yarp.js</h3> <span class="authors">Carlo Ciliberto</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Frontiers in Robotics and AI</span> <span class="date">2017</span>
<p class="abstract"> We present yarp.js, a JavaScript framework enabling robotics networks to interface and interact with external devices by exploiting modern Web communication protocols. By connecting a YARP server module with a browser client on any external device, yarp.js allows to access on board sensors using standard Web APIs and stream the acquired data through the yarp.js network without the need for any installation. Communication between YARP modules and yarp.js clients is bi-directional, opening also the possibility for robotics applications to exploit the capabilities of modern browsers to process external data, such as speech synthesis, 3D data visualization, or video streaming to name a few. Yarp.js requires only a browser installed on the client device, allowing for fast and easy deployment of novel applications. The code and sample applications to get started with the proposed framework are available for the community at the yarp.js GitHub repository. </p>
<a class="link-pdf" href="https://www.frontiersin.org/articles/10.3389/frobt.2017.00067/full">pdf</a>
<a class="link-code" href="https://github.com/robotology/yarp.js">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="Example of Yarp js network: YARP network is connected via WebSockets with devices on which YARP has not been installed." src="papers/yarp-js17/yarp-js.png">
<pre class="bibtex">@article{10.3389/frobt.2017.00067,
author={Ciliberto, Carlo},
title={Connecting YARP to the Web with Yarp.js},
journal={Frontiers in Robotics and AI},
volume={4},
pages={67},
year={2017},
URL={https://www.frontiersin.org/article/10.3389/frobt.2017.00067},
DOI={10.3389/frobt.2017.00067},
ISSN={2296-9144},
}
</pre>
<p class="tags"> YARP, robotics </p>
</li>
<li>
<h3 class="title">A Consistent Regularization Approach for Structured Prediction</h3> <span class="authors">Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">NIPS</span> <span class="date">2016</span>
<p class="abstract"> We propose and analyze a regularization approach for structured prediction problems. We characterize a large class of loss functions that allows to naturally embed structured outputs in a linear space. We exploit this fact to design learning
algorithms using a surrogate loss approach and regularization techniques. We prove universal consistency and finite sample bounds characterizing the generalization properties of the proposed method. Experimental results are provided
to demonstrate the practical usefulness of the proposed approach. </p>
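<p>The proposed estimator has a simple two-step form that a short sketch can make concrete: kernel ridge regression yields score weights alpha(x), and the prediction minimizes the alpha-weighted structured loss over a set of candidate outputs. The Gaussian kernel, the zero-one loss, and the finite candidate set below are illustrative assumptions.</p>
<pre class="code">
# Sketch of the surrogate + decoding scheme:
#   f(x) = argmin_y  sum_i alpha_i(x) * loss(y, y_i),
#   alpha(x) = (K + n*lam*I)^{-1} k(x).
import numpy as np

def fit_scores(X, lam=1e-3, sigma=1.0):
    n = X.shape[0]
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))
    Kinv = np.linalg.solve(K + n * lam * np.eye(n), np.eye(n))
    def alpha(x):
        kx = np.exp(-((X - x) ** 2).sum(-1) / (2 * sigma ** 2))
        return Kinv @ kx
    return alpha

def decode(x, alpha, Ytrain, candidates, loss):
    a = alpha(x)
    vals = [np.sum(a * np.array([loss(y, yi) for yi in Ytrain]))
            for y in candidates]
    return candidates[int(np.argmin(vals))]

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
Ytrain = rng.integers(0, 4, size=20)          # toy structured outputs
zero_one = lambda y, yi: float(y != yi)
alpha = fit_scores(X)
y_hat = decode(X[0], alpha, Ytrain, [0, 1, 2, 3], zero_one)
</pre>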
<a class="link-pdf" href="http://arxiv.org/pdf/1605.07588">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="papers/sp-nips16/slides.pdf">slides</a>
<a class="link-video" href="https://www.youtube.com/watch?v=gjk0N5Qltfg&t=71s">video</a>
<img class="picture" alt="A Consistent Regularization Approach for Structured Prediction. First work proposing a structured prediction framework and algorithm for which it is possible to prove universal consistency and non asymptotic learning rates (generalization bounds)" src="papers/consistent-struct-pred16/consistent-struct-pred.png">
<pre class="bibtex">@inproceedings{ciliberto2016consistent,
title={A Consistent Regularization Approach for Structured Prediction},
author={Ciliberto, Carlo and Rosasco, Lorenzo and Rudi, Alessandro},
booktitle={Advances in Neural Information Processing Systems},
pages={4412--4420},
year={2016}
}
</pre>
<p class="tags"> machine learning, robotics, incremental learning</p>
</li>
<li>
<h3 class="title">Combining sensory modalities and exploratory procedures to improve haptic object recognition in robotics</h3> <span class="authors">Bertrand Higy, Carlo Ciliberto, Lorenzo Rosasco, Lorenzo Natale</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Humanoids</span> <span class="date">2016</span>
<p class="abstract"> In this paper we tackle the problem of object
recognition using haptic feedback from a robot holding and
manipulating different objects. One of the main challenges
in this setting is to understand the role of different sensory
modalities (namely proprioception, object weight from F/T
sensors and touch) and how to combine them to correctly
discriminate different objects. We investigated these aspects by considering multiple sensory
channels and different exploratory strategies to gather meaningful
information regarding the object’s physical properties.
We propose a novel strategy to train a learning machine able to
efficiently combine sensory modalities by first learning individual
object features and then combine them in a single classifier.
To evaluate our approach and compare it with previous methods
we collected a dataset for haptic object recognition, comprising
11 objects that were held in the hands of the iCub robot while
performing different exploration strategies. Results show that
our strategy consistently outperforms previous approaches.
</p>
<a class="link-pdf" href="https://www.researchgate.net/profile/Bertrand_Higy/publication/312112438_Combining_sensory_modalities_and_exploratory_procedures_to_improve_haptic_object_recognition_in_robotics/links/5a046771a6fdcc1c2f5fafa2/Combining-sensory-modalities-and-exploratory-procedures-to-improve-haptic-object-recognition-in-robotics.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/haptic-bert/pic.svg">
<pre class="bibtex">@inproceedings{higy2016combining,
title={Combining sensory modalities and exploratory procedures to improve haptic object recognition in robotics},
author={Higy, Bertrand and Ciliberto, Carlo and Rosasco, Lorenzo and Natale, Lorenzo},
booktitle={Humanoid Robots (Humanoids), 2016 IEEE-RAS 16th International Conference on},
pages={117--124},
year={2016},
organization={IEEE}
}
</pre>
<p class="tags"> machine learning, robotics, haptic recognition</p>
</li>
<li>
<h3 class="title">Active perception: Building objects' models using tactile exploration</h3> <span class="authors">Nawid Jamali, Carlo Ciliberto, Lorenzo Rosasco, Lorenzo Natale</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Humanoids</span> <span class="date">2016</span>
<p class="abstract"> In this paper we present an efficient active learning strategy applied to the problem of tactile exploration of an object's surface. The method uses Gaussian process (GPs) classification to efficiently sample the surface of the object in order to reconstruct its shape. The proposed method iteratively samples the surface of the object, while, simultaneously constructing a probabilistic model of the object's surface. The probabilities in the model are used to guide the exploration. At each iteration, the estimate of the object's shape is used to slice the object in equally spaced intervals along the height of the object. The sampled locations are then labelled according to the interval in which their height falls. In its simple form, the data are labelled as belonging to the object and not belonging to the object: object and no-object, respectively. A GP classifier is trained to learn the object/no-object decision boundary. The next location to be sampled is selected at the classification boundary, in this way, the exploration is biased towards more informative areas. Complex features of the object's surface is captured by increasing the number of intervals as the number of sampled locations is increased. We validated our approach on six objects of different shapes using the iCub humanoid robot. Our experiments show that the method outperforms random selection and previous work based on GP regression by sampling more points on and near-the-boundary of the object.
</p>
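<p>The boundary-seeking sampling rule can be condensed into a few lines: fit a GP classifier on the object/no-object labels collected so far and probe next where the predicted probability is closest to 0.5. The sketch below uses scikit-learn, collapses the paper's multi-interval labelling to two labels, and uses placeholder data.</p>
<pre class="code">
# Active tactile exploration sketch: the most informative next probe is
# the candidate closest to the GP decision boundary (p = 0.5).
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

def next_probe(X_sampled, y_sampled, X_candidates):
    gpc = GaussianProcessClassifier().fit(X_sampled, y_sampled)
    p = gpc.predict_proba(X_candidates)[:, 1]
    return X_candidates[np.argmin(np.abs(p - 0.5))]

rng = np.random.default_rng(0)
Xs = rng.random((15, 3))                 # sampled 3-D probe locations
ys = rng.integers(0, 2, size=15)         # object / no-object labels
grid = rng.random((500, 3))              # candidate probe locations
probe = next_probe(Xs, ys, grid)         # next location to touch
</pre>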
<a class="link-pdf" href="https://ieeexplore.ieee.org/document/7803275/">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="https://www.youtube.com/watch?v=GQ7h0g35Kp4">video</a>
<img class="picture" src="papers/haptic-jamali/pic.svg">
<pre class="bibtex">@inproceedings{jamali2016active,
title={Active perception: Building objects' models using tactile exploration},
author={Jamali, Nawid and Ciliberto, Carlo and Rosasco, Lorenzo and Natale, Lorenzo},
booktitle={Humanoid Robots (Humanoids), 2016 IEEE-RAS 16th International Conference on},
pages={179--185},
year={2016},
organization={IEEE}
}
</pre>
<p class="tags"> machine learning, robotics, haptic recognition, active learning</p>
</li>
<li>
<h3 class="title">
Object identification from few examples by improving the invariance of a deep convolutional neural network</h3> <span class="authors">Giulia Pasquale, Carlo Ciliberto, Lorenzo Rosasco, Lorenzo Natale</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">IROS</span> <span class="date">2016</span>
<p class="abstract">
The development of reliable and robust visual recognition systems is a key challenge for the deployment of autonomous robotic agents in unconstrained environments. Learning to recognize objects requires image representations that are discriminative for relevant information while being invariant to nuisances such as scaling, rotations, light and background changes, and so forth. Deep Convolutional Neural Networks can learn such representations from large web-collected image datasets, and a natural question is how these systems can best be adapted to the robotics context, where little supervision is often available. In this work, we investigate different training strategies for deep architectures on a new dataset collected in a real-world robotic setting. In particular, we show how deep networks can be tuned to improve invariance and discriminability properties and perform object identification tasks with minimal supervision.
</p>
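<p>The recipe studied here, reusing a web-trained deep network as a feature extractor and training a light classifier on top with few examples per object, can be sketched as follows; the torchvision ResNet-18 backbone is an illustrative assumption, as the paper experiments with different architectures and tuning strategies.</p>
<pre class="code">
# Few-supervision object identification sketch: freeze a pretrained CNN,
# use it as a descriptor extractor, and (re)train a light classifier on
# the robot's own objects. Backbone choice is illustrative.
import torch
import torchvision.models as models
from torchvision.models import ResNet18_Weights

backbone = models.resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()        # drop the ImageNet head
backbone.eval()

@torch.no_grad()
def features(batch):                     # batch: (N, 3, 224, 224), normalized
    return backbone(batch)               # (N, 512) descriptors

# A linear classifier trained on these descriptors with a handful of
# examples per object then plays the role of the identification layer.
feats = features(torch.randn(4, 3, 224, 224))
</pre>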
<a class="link-pdf" href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7759720">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="https://www.youtube.com/watch?v=ghUFweqm7W8&t=37s">video</a>
<img class="picture" src="papers/haptic-jamali/pic.svg">
<pre class="bibtex">@inproceedings{pasquale2016object,
title={Object identification from few examples by improving the invariance of a deep convolutional neural network},
author={Pasquale, Giulia and Ciliberto, Carlo and Rosasco, Lorenzo and Natale, Lorenzo},
booktitle={Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on},
pages={4904--4911},
year={2016},
organization={IEEE}
}
</pre>
<p class="tags"> machine learning, robotics, object recognition, representation learning, deep learning</p>
</li>
<li>
<h3 class="title">
Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives
</h3> <span class="authors">Giulia Pasquale, Tanis Mar, Carlo Ciliberto, Lorenzo Rosasco, Lorenzo Natale</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Frontiers in Robotics and AI</span> <span class="date">2016</span>
<p class="abstract">
Reliable depth perception eases and enables a large variety of attentional and interactive behaviors on humanoid robots. However, the use of depth in real-world scenarios is hindered by the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras. On the iCub humanoid robot, we recently adopted the Efficient Large-scale Stereo (ELAS) Matching algorithm (Geiger et al., 2010) for computation of the disparity map. In this technical report, we show that this algorithm allows reliable depth perception and provide experimental evidence that it can be used to solve challenging visual tasks in real-world indoor settings. As a case study, we consider the common situation where the robot is asked to focus its attention on one close object in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. This example paves the way to a variety of other similar applications.
</p>
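<p>The disparity-based segmentation used in the case study can be sketched very simply: keep the pixels whose disparity is close to the maximum (the nearest surface) and take the largest connected blob as the attended object. The threshold and array shapes below are illustrative assumptions.</p>
<pre class="code">
# Disparity-driven attention sketch: segment the closest object as the
# largest connected region of near-maximal disparity.
import numpy as np
from scipy import ndimage

def closest_object_mask(disparity, band=8.0):
    near = np.greater(disparity, disparity.max() - band)
    labels, n = ndimage.label(near)          # connected components
    if n == 0:
        return near
    sizes = ndimage.sum(near, labels, index=range(1, n + 1))
    return labels == (1 + int(np.argmax(sizes)))

disp = np.zeros((120, 160))
disp[40:80, 60:100] = 32.0                   # a close blob in a far scene
mask = closest_object_mask(disp)             # True on the attended object
</pre>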
<a class="link-pdf" href="https://arxiv.org/pdf/1509.06939.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/haptic-jamali/pic.svg">
<pre class="bibtex">@article{10.3389/frobt.2016.00035,
author={Pasquale, Giulia and Mar, Tanis and Ciliberto, Carlo and Rosasco, Lorenzo and Natale, Lorenzo},
title={Enabling Depth-Driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives},
journal={Frontiers in Robotics and AI},
volume={3},
pages={35},
year={2016},
URL={https://www.frontiersin.org/article/10.3389/frobt.2016.00035},
DOI={10.3389/frobt.2016.00035},
ISSN={2296-9144},
}
</pre>
<p class="tags"> computer vision, robotics, depth estimation</p>
</li>
<li>
<h3 class="title">
Convex Learning of Multiple Tasks and their Structure</h3> <span class="authors">Carlo Ciliberto, Youssef Mroueh, Tomaso Poggio, Lorenzo Rosasco</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">ICML</span> <span class="date">2015</span>
<p class="abstract">
Reducing the amount of human supervision is a key problem in machine learning, and a natural approach is to exploit the relations (structure) among different tasks. This is the idea at the core of multi-task learning. In this context, a fundamental question is how to incorporate the task structure in the learning problem. We tackle this question by studying a general computational framework that allows encoding a priori knowledge of the task structure in the form of a convex penalty; in this setting a variety of previously proposed methods can be recovered as special cases, including linear and non-linear approaches. Within this framework, we show that tasks and their structure can be efficiently learned by considering a convex optimization problem that can be approached by means of block coordinate methods such as alternating minimization, and for which we prove convergence to the global minimum.
</p>
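<p>A simplified linear instance of the framework can be sketched as alternating minimization with penalty lam * tr(W A^{-1} W^T): for fixed structure A the W-step reduces to a Sylvester equation, and for fixed W the A-step has a classical closed form. This is an illustration under simplifying assumptions, not the authors' code.</p>
<pre class="code">
# Alternating minimization sketch for linear multitask learning with a
# convex structure penalty lam * tr(W A^{-1} W^T).
import numpy as np
from scipy.linalg import solve_sylvester, sqrtm

def alternating_mtl(X, Y, lam=1e-2, iters=20):
    T = Y.shape[1]
    A = np.eye(T) / T                        # initial task structure
    for _ in range(iters):
        # W-step: X^T X W + lam * W A^{-1} = X^T Y (Sylvester equation).
        W = solve_sylvester(X.T @ X, lam * np.linalg.inv(A), X.T @ Y)
        # A-step: closed form A = (W^T W)^{1/2} / tr((W^T W)^{1/2}).
        S = np.real(sqrtm(W.T @ W)) + 1e-8 * np.eye(T)
        A = S / np.trace(S)
    return W, A

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))           # shared inputs
Y = X @ rng.standard_normal((20, 4)) + 0.1 * rng.standard_normal((100, 4))
W, A = alternating_mtl(X, Y)                 # weights and learned structure
</pre>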
<a class="link-pdf" href="https://arxiv.org/pdf/1504.03101.pdf">pdf</a>
<a class="link-code" href="https://github.com/cciliber/matMTL">code</a>
<a class="link-slides" href="papers/convex-multitask-learning15/unifying-framework-convex-multitask-learning.pdf">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" alt="Unifying framework for multitask learning and their structure of relations." src="papers/convex-multitask-learning15/convex-multitask-learning.png">
<pre class="bibtex">@inproceedings{ciliberto2015convex,
title={Convex learning of multiple tasks and their structure},
author={Ciliberto, Carlo and Mroueh, Youssef and Poggio, Tomaso and Rosasco, Lorenzo},
booktitle={International Conference on Machine Learning},
pages={1548--1557},
year={2015}
}
</pre>
<p class="tags"> machine learning, multitask learning, convex optimization, kernels, kernel methods</p>
</li>
<li>
<h3 class="title">
Learning Multiple Visual Tasks while Discovering their Structure</h3> <span class="authors">Carlo Ciliberto, Lorenzo Rosasco, Silvia Villa</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">CVPR</span> <span class="date">2015</span>
<p class="abstract">
Multi-task learning is a natural approach for computer vision applications that require the simultaneous solution of several distinct but related problems, e.g., object detection, classification, tracking of multiple agents, or denoising. The key idea is that exploiting task relatedness (structure) can lead to improved performance. In this paper, we propose and study a novel sparse, non-parametric approach exploiting the theory of Reproducing Kernel Hilbert Spaces for vector-valued functions. We develop a suitable regularization framework which can be formulated as a convex optimization problem, and is provably solvable using an alternating minimization approach. Empirical tests show that the proposed method compares favorably to state-of-the-art techniques and further allows the recovery of interpretable structures, a problem of interest in its own right.
</p>
<a class="link-pdf" href="https://arxiv.org/pdf/1504.03106.pdf">pdf</a>
<a class="link-code" href="https://github.com/cciliber/matMTL">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/multitask-discover-sparse-structure15/sparse-multitask-learning.png">
<pre class="bibtex">@inproceedings{ciliberto2015learning,
title={Learning multiple visual tasks while discovering their structure},
author={Ciliberto, Carlo and Rosasco, Lorenzo and Villa, Silvia},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={131--139},
year={2015}
}
</pre>
<p class="tags"> machine learning, multitask learning, convex optimization, kernels, kernel methods, computer vision, sparsity, structure learning</p>
</li>
<li>
<h3 class="title">
Characterizing the input-output function of the olfactory-limbic pathway in the guinea pig</h3> <span class="authors">Gian Luca Breschi, Carlo Ciliberto, Thierry Nieus, Lorenzo Rosasco, Stefano Taverna, Michela Chiappalone, Valentina Pasquale</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">Computational Intelligence and Neuroscience</span> <span class="date">2015</span>
<p class="abstract">
Nowadays the neuroscientific community is taking more and more advantage of the continuous interaction between engineers and computational neuroscientists to develop neuroprostheses aimed at replacing damaged brain areas with artificial devices. To this end, a technological effort is required to develop neural network models which can be fed with the recorded electrophysiological patterns to yield the correct brain stimulation to recover the desired functions. In this paper we present a machine learning approach to derive the input-output function of the olfactory-limbic pathway in the in vitro whole brain of the guinea pig, which is less complex and more controllable than an in vivo system. We first experimentally characterized the neuronal pathway by delivering different sets of electrical stimuli from the lateral olfactory tract (LOT) and by recording the corresponding responses in the lateral entorhinal cortex (l-ERC). As a second step, we used information theory to evaluate how much information output features carry about the input. Finally, we used the acquired data to learn the LOT-l-ERC "I/O function" by means of the kernel regularized least squares method, which is able to predict l-ERC responses on the basis of LOT stimulation features. Our modeling approach can be further exploited for brain prosthesis applications.
</p>
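<p>The learning step amounts to kernel regularized least squares from stimulation features to response features; a minimal stand-in using scikit-learn's KernelRidge on synthetic placeholder data looks as follows.</p>
<pre class="code">
# KRLS sketch of the stimulus-to-response ("I/O") map; data are synthetic
# placeholders for the LOT stimulation and l-ERC response features.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X_stim = rng.standard_normal((40, 5))    # per-trial stimulation features
Y_resp = rng.standard_normal((40, 3))    # per-trial response features
io_model = KernelRidge(kernel="rbf", alpha=1e-2).fit(X_stim, Y_resp)
pred = io_model.predict(X_stim[:5])      # predicted l-ERC responses, (5, 3)
</pre>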
<a class="link-pdf" href="https://arxiv.org/pdf/1504.03106.pdf">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/sparse-mtl/pic.svg">
<pre class="bibtex">@article{breschi2015characterizing,
title={Characterizing the input-output function of the olfactory-limbic pathway in the guinea pig},
author={Breschi, Gian Luca and Ciliberto, Carlo and Nieus, Thierry and Rosasco, Lorenzo and Taverna, Stefano and Chiappalone, Michela and Pasquale, Valentina},
journal={Computational Intelligence and Neuroscience},
volume={2015},
pages={60},
year={2015},
publisher={Hindawi Publishing Corp.}
}
</pre>
<p class="tags"> machine learning, multitask learning, convex optimization, kernels, kernel methods, computer vision, sparsity, structure learning</p>
</li>
<li>
<h3 class="title">Exploiting global force torque measurements for local compliance estimation in tactile arrays</h3> <span class="authors">Carlo Ciliberto, Luca Fiorio, Marco Maggiali, Lorenzo Natale, Lorenzo Rosasco, Giorgio Metta, Giulio Sandini, Francesco Nori</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">IROS</span> <span class="date">2014</span>
<p class="abstract">
In this paper we tackle the problem of estimating the local compliance of tactile arrays by exploiting global measurements from a single force and torque sensor. The proposed procedure exploits a transformation matrix (describing the relative position between the local tactile elements and the global force/torque measurements) to define a linear regression problem on the unknown local stiffness. Experiments have been conducted on the foot of the iCub robot, sensorized with a single force/torque sensor and a tactile array of 250 tactile elements (taxels) on the foot sole. Results show that a simple calibration procedure can be employed to estimate the stiffness parameters of virtual springs over a tactile array and to use this model to predict normal forces exerted on the array based only on tactile feedback. Building on previous work [1], the proposed procedure does not necessarily need a priori information on the transformation matrix of the taxels, which can be directly estimated from available measurements.
</p>
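<p>The calibration step described above reduces to ordinary linear least squares: with per-trial taxel activations D and the global normal force f from the F/T sensor, the per-taxel stiffnesses k solve f ≈ D k. The sketch below uses synthetic data, and the purely linear model and all names are illustrative assumptions.</p>
<pre class="code">
# Stiffness calibration sketch: estimate per-taxel stiffnesses from global
# F/T readings, then predict the normal force from the skin alone.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_taxels = 600, 250
D = rng.random((n_trials, n_taxels))          # taxel activations per trial
k_true = rng.random(n_taxels)                 # unknown local stiffnesses
f = D @ k_true + 0.01 * rng.standard_normal(n_trials)  # global F/T force

k_hat, *_ = np.linalg.lstsq(D, f, rcond=None) # calibrated stiffnesses
f_pred = D @ k_hat                            # force predicted from skin only
</pre>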
<a class="link-pdf" href="https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6943124">pdf</a>
<a class="link-code" href="">code</a>
<a class="link-slides" href="">slides</a>
<a class="link-video" href="">video</a>
<img class="picture" src="papers/global-force-torque-measurement-skin-foot14/force-torque-skin-foot-model.png">
<pre class="bibtex">@inproceedings{ciliberto2014exploiting,
title={Exploiting global force torque measurements for local compliance estimation in tactile arrays},
author={Ciliberto, Carlo and Fiorio, Luca and Maggiali, Marco and Natale, Lorenzo and Rosasco, Lorenzo and Metta, Giorgio and Sandini, Giulio and Nori, Francesco},
booktitle={Intelligent Robots and Systems (IROS 2014), 2014 IEEE/RSJ International Conference on},
pages={3994--3999},
year={2014},
organization={IEEE}
}
</pre>
<p class="tags"> machine learning, robotics, artificial skin, calibration, estimation</p>
</li>
<li>
<h3 class="title">Ask the image: supervised pooling to preserve feature locality
</h3> <span class="authors">Sean Ryan Fanello, Nicoletta Noceti, Carlo Ciliberto, Giorgio Metta, Francesca Odone</span> <span class="venue type relevance addmsg" type="conference" relevance="3" alert="ORAL">CVPR</span> <span class="date">2014</span>
<p class="abstract">
In this paper we propose a weighted supervised pooling method for visual recognition systems. We combine a standard Spatial Pyramid Representation, which is commonly adopted to encode spatial information, with a Feature Space Representation favoring semantic information in an appropriate feature space. For the latter, we propose