<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<title>Pushpendre Rastogi Homepage</title>
<link rel="stylesheet" href="/res/mystylesheet.css">
<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/10.5.0/styles/default.min.css">
<link href="/res/ck/plugins/prism/lib/prism/prism_patched.min.css" rel="stylesheet">
<script src="/res/ck/plugins/prism/lib/prism/prism_patched.min.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/10.5.0/highlight.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
<!-- /res/mathjax-es5-v3-2-2/tex-mml-svg.js -->
<script src="/res/mathjax-es5-v3-2-2/tex-mml-chtml.js" type="text/javascript" charset="utf-8"></script>
</head>
<body>
<div class="row">
<div class="column left">
<img src="res/header.png" width="90%" alt="header png"/>
<a class="twitter-timeline"
data-height="725"
data-dnt="true" href="https://twitter.com/Pushpendre89?ref_src=twsrc%5Etfw">Twitter</a>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<h4>Links</h4>
<p><a href="https://stackexchange.com/users/257045/pushpendre">
<img src="https://stackexchange.com/users/flair/257045.png"
width="208" height="58"
alt="profile for Pushpendre on Stack Exchange"
title="profile for Pushpendre on Stack Exchange" />
</a>
</p>
<p><a href="https://github.com/se4u">github</a></p>
<p><a href="https://www.linkedin.com/in/pushpendre/" >
<img src="https://upload.wikimedia.org/wikipedia/commons/0/01/LinkedIn_Logo.svg" width="100" height="25" alt="Linkedin Logo">
</a>
</p>
<p><a href="https://github.com/pushpendre/pushpendre.github.io/blob/master/index.html">Edit this page</a></p>
<p><a href="https://github.com/pushpendre/pushpendre.github.io/search?q=quaternion" target="_blank">Search this website</a></p>
<p><a href="https://github.com/pushpendre/pushpendre.github.io/deployments/activity_log?environment=github-pages">See deployment status</a></p>
<details><summary>how to edit this page</summary>
<p>make changes to index.html and validate them using "tidy --version ; tidy --warn-proprietary-attributes no -e -q index.html",
this command is also added as a pre-commit hook. Also use "python -m http.server" to check that the site looks okay. For writing raw html go to <a href="res/editor.html">editor.html</a></p>
</details>
</div>
<div class="column right">
<header class="col-span">
<h1 class="title counter-skip">Pushpendre Rastogi</h1>
<h2 class="subtitle counter-skip">pushpendre at gmail</h2>
</header>
<h2>Introduction</h2><p>I joined Google DeepMind as a Research Engineer in 2023 and moved to the Bay Area from Seattle for the job. I earned a promotion to Senior Applied Scientist in 2021. I joined Amazon Prime Research to work on Plan Recommendation and Content Optimization in April 2020. I joined the Dialog State Tracking team in Amazon Alexa in April 2019. I completed my Ph.D. in Computer Science at <a href="http://www.clsp.jhu.edu">The Center For Language and Speech Processing, Johns Hopkins University</a>. My advisor was <a href="http://www.cs.jhu.edu/~vandurme/">Benjamin Van Durme</a>. I TA'd graduate courses on representation learning and machine learning for three semesters during my Ph.D. studies, and I received the <a href="https://engineering.jhu.edu/excellence-teaching-awards/#tbs_nav_item_1">George Sommerman Graduate Teaching Assistant Award</a> with a cash award of $1000. I have reviewed for Transactions on Signal Processing-19, NeurIPS-19, ICML-19, ICLR-19, EMNLP-19, ACL-19, TPAMI-18, NeurIPS-18, KG4IR-18, EMNLP-18, ACL-18.</p>
<h2>Selected Publications</h2>
<p>See my <a href="https://scholar.google.com/citations?user=nqDASHMAAAAJ">google scholar profile</a> for a complete list of publications.</p>
<ul>
<li>"Improving long distance slot carryover in spoken dialog systems".<a href="http://www.cs.jhu.edu/~tongfei/">Tongfei Chen</a>, Chetan Naik, Hua He, Pushpendre Rastogi, and Lambert Mathias. (2019) <a href="https://arxiv.org/abs/1906.01149">[arxiv]</a> <a style="background-color: rgb(255,255,0); color: rgb(255,0,0)" href="https://sites.google.com/view/nlp4convai/program">[bestpaper]</a></li>
<li>"Scaling Multi-Domain Dialogue State Tracking via Query Reformulation".Pushpendre Rastogi, <a href="https://www.linkedin.com/in/arpit-gupta-77759719/">Arpit Gupta</a>, <a href="http://www.cs.jhu.edu/~tongfei/">Tongfei Chen</a>, and Lambert Mathias. (2019) <a href="https://arxiv.org/abs/1903.05164">[arxiv]</a> <a href="http://www.xuwei.io/2019/03/25/%E3%80%8Ascaling-multi-domain-dialogue-state-tracking-via-query-reformulation%E3%80%8B%E8%AE%BA%E6%96%87%E7%AC%94%E8%AE%B0">[chinese-translation-1]</a> <a href="https://www.facebook.com/pushpendre/posts/2555616624460105">[chinese-translation-2]</a> <a href="https://github.com/alexa/alexa-dataset-contextual-query-rewrite">[dataset]</a>
<!-- style="background-color: rgb(255,255,0); color: rgb(255,0,0)" -->
<a href="https://venturebeat.com/2019/06/06/amazons-ai-rewrites-voice-commands-in-natural-language-to-reduce-false-positives">[press-venturebeat]</a></li>
<li>"A dataset for resolving referring expressions in spoken dialogue via contextual query rewrites (cqr)".Michael Regan, Pushpendre Rastogi, <a href="https://www.linkedin.com/in/arpit-gupta-77759719/">Arpit Gupta</a>, and Lambert Mathias. (2019) <a href="https://arxiv.org/abs/1903.11783">[arxiv]</a> <a href="https://github.com/alexa/alexa-dataset-contextual-query-rewrite">[dataset]</a></li>
<li>"Neural variational entity set expansion for automatically populated knowledge graphs".Pushpendre Rastogi, <a href="https://www.cs.jhu.edu/~apoliak1/">Adam Poliak</a>, <a href="https://www.ams.jhu.edu/~lyzinski/">Vince Lyzinski</a>, and <a href="http://www.cs.jhu.edu/~vandurme/">Benjamin Van Durme</a>. (2018) <a href="/res/nvse.bib">[bib]</a> <a href="https://doi.org/10.1007/s10791-018-9342-1">[doi]</a> <a href="https://github.com/se4u/nvse">[code]</a> <a href="https://youtu.be/sGO_wvuPIzM">[demo]</a><a href="https://github.com/se4u/nvse/blob/master/kg4ir_journal_tex/kg4irjournal.pdf">[pdf]</a></li>
<li>"Efficient, compositional, order-sensitive n-gram embeddings".<a href="https://www.cs.jhu.edu/~apoliak1/">Adam Poliak</a>, Pushpendre Rastogi, M. Patrick Martin, and <a href="http://www.cs.jhu.edu/~vandurme/">Benjamin Van Durme</a>. (2017) <a href="http://www.aclweb.org/anthology/E17-2081">[pdf]</a> <a href="http://www.aclweb.org/anthology/E17-2081.bib">[bib]</a> <a href="https://github.com/azpoliak/eco">[code]</a> <a href="https://www.cs.jhu.edu/~apoliak1/files/posters/ECO--EACL-2017-poster.pdf">[poster]</a></li>
<li>"Vertex nomination on the cold start knowledge graph".Pushpendre Rastogi, <a href="https://www.ams.jhu.edu/~lyzinski/">Vince Lyzinski</a>, and <a href="http://www.cs.jhu.edu/~vandurme/">Benjamin Van Durme</a>. (2017) <a href="/res/kbvntr.pdf">[pdf]</a> <a href="/res/kbvntr.bib">[bib]</a></li>
<li>"Weighting finite-state transductions with neural context".Pushpendre Rastogi, <a href="https://ryancotterell.github.io/">Ryan Cotterell</a>, and <a href="http://www.cs.jhu.edu/~jason/">Jason Eisner</a>. (2016) <a href="http://www.aclweb.org/anthology/N16-1076.bib">[bib]</a> <a href="http://www.aclweb.org/anthology/N16-1076">[pdf]</a> <a href="/res/rastogi2016weighting.slides.pdf">[slides]</a> <a href="https://github.com/se4u/neural_wfst.git">[code]</a></li>
<li>"Efficient implementation of enhanced adaptive simultaneous perturbation algorithms".Pushpendre Rastogi, <a href="https://www.linkedin.com/in/jingyizhu/">Jingyi Zhu</a>, and <a href="http://www.ams.jhu.edu/~spall/Personal/">James Spall</a>. (2016) <a href="/res/rastogi2016efficient.bib">[bib]</a> <a href="https://github.com/se4u/FASPSA">[code]</a> <a href="/res/rastogi2016efficient.pdf">[pdf]</a> <a href="https://github.com/facebookresearch/nevergrad/pull/16">[nevergrad]</a></li>
<li>"Multiview lsa: Representation learning via generalized cca".Pushpendre Rastogi, <a href="http://www.cs.jhu.edu/~vandurme/">Benjamin Van Durme</a>, and <a href="http://www.cs.jhu.edu/~raman/">Raman Arora</a>. (2015) <a href="http://www.aclweb.org/anthology/N15-1058">[pdf]</a> <a href="https://zenodo.org/record/16710">[data]</a> <a href="https://github.com/se4u/mvlsa">[code]</a> <a href="http://www.aclweb.org/anthology/N15-1058.bib">[bib]</a> <a href="/mvlsa/mvlsa_poster.pdf">[poster]</a> <a href="/mvlsa/multiview-lsa-proofs-and-faq.html">[supplementary]</a></li>
<li>"Stationarity condition for fractional sampling filters".Pushpendre Rastogi. (2011) <a href="/res/mtp.pdf">[pdf]</a></li></ul>
<h2>Education</h2><table><tbody>
<tr><td style="text-align: left;"><a href="https://courses.edx.org/certificates/ce8af62f77ac47849920055856b1b15c">Introduction to Biology</a></td><td>MITx at EDX</td><td>2021</td><td>Pass</td></tr>
<tr><td style="text-align: left;">Ph.D. and M.S. in Computer Science</td><td>Johns Hopkins University</td><td>2013-19</td><td>3.75/4.0</td></tr>
<tr><td colspan="4" style="text-align: left;">Thesis Topic: Representation Learning for Words and Entities. I presented new methods for unsupervised learning of word and entity embeddings from texts and knowledge bases.<br />Courses and Grades: Natural Language Processing (A), Machine Learning in Complex Domains (A), Stochastic Search & Optimization (B), Parallel Programming (A-), Principles of Programming Languages (A-), Combinatorial Optimization (A+), Introduction to Convexity (A-)</td></tr>
<tr><td style="text-align: left;">M.Tech. in Information and Communication Technology</td><td>IIT Delhi</td><td>2010-11</td><td>8.77/10</td></tr>
<tr><td style="text-align: left;">B.Tech. in Electrical Engg.</td><td>IIT Delhi</td><td>2006-10</td><td>8.86/10</td></tr>
</tbody></table>
<h2>Code Snippets</h2>
<!-- <details>
<summary> X </summary>
<p><pre><code> Y </code></pre>
</details> -->
<details>
<summary> Quick and dirty experiment tracker <a href="posts/quick-and-dirty-experiment-tracker.html">(open in new page)</a></summary>
<iframe src="posts/quick-and-dirty-experiment-tracker.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Problem dependent reparameterization of a knapsack problem for asymptotic efficiency <a href="posts/problem-dependent-reparameterization-knapsack-asymptotic-efficiency.html">(open in new page)</a></summary>
<iframe src="posts/problem-dependent-reparameterization-knapsack-asymptotic-efficiency.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Parsing using a grammar (abstraction + latent variables) vs actions on a stack <a href="posts/parsing-grammar-vs-stack.html">(open in new page)</a></summary>
<iframe src="posts/parsing-grammar-vs-stack.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Some Graph Algorithms <a href="posts/some-graph-algorithms.html">(open in new page)</a></summary>
<iframe src="posts/some-graph-algorithms.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Bayesian Optimization in AX with constraints <a href="posts/bayesian-optimization-in-ax-with-constraints.html">(open in new page)</a></summary>
<iframe src="posts/bayesian-optimization-in-ax-with-constraints.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Simplify a jupyter notebook <a href="posts/simplify-a-jupyter-notebook.html">(open in new page)</a></summary>
<iframe src="posts/simplify-a-jupyter-notebook.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details>
<summary> Interactive pool for running jobs on the side in python. </summary>
<pre><code>import datetime
import time
from concurrent.futures import ProcessPoolExecutor

class InteractivePool:
    """Track a list of futures J so they can be polled from an interactive session."""
    def __init__(self, J):
        self.J = J
        self.tic = time.time()
    def done(self):
        # (number of finished jobs, total number of jobs)
        return sum(1 for j in self.J if j.done()), len(self.J)
    def collect(self, R=None):
        # Add results of finished, non-cancelled jobs to R, keyed by job index.
        R = {} if R is None else R
        print(len(R))
        for i, j in enumerate(self.J):
            if i not in R and j.done() and not j.cancelled():
                R[i] = j.result()
        print(len(R))
        return R
    def time(self):
        # Wall-clock time elapsed since the pool was created.
        return str(datetime.timedelta(seconds=time.time() - self.tic))
    def wait(self, interval=600):
        # Poll every `interval` seconds until every job is done.
        while not all(j.done() for j in self.J):
            time.sleep(interval)

# f is any picklable function of one argument, defined elsewhere.
sidejob = ProcessPoolExecutor(max_workers=4).submit
P = InteractivePool([sidejob(f, i) for i in range(80)]) </code></pre>
</details>
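<p>The polling loop in <code>wait</code> above can also be replaced by a blocking call from the standard library. A minimal sketch, assuming a thread pool and a hypothetical stand-in job <code>square</code> (neither is part of the original snippet): <code>concurrent.futures.wait</code> blocks until every future has finished, after which results are gathered by job index in the same way <code>collect</code> does.</p>

```python
from concurrent.futures import ThreadPoolExecutor, wait

def square(i):
    # Hypothetical stand-in for the real side job.
    return i * i

def collect(jobs):
    # Gather results of finished, non-cancelled jobs, keyed by job index.
    return {i: j.result() for i, j in enumerate(jobs)
            if j.done() and not j.cancelled()}

with ThreadPoolExecutor(max_workers=4) as pool:
    jobs = [pool.submit(square, i) for i in range(8)]
    wait(jobs)  # blocks until all jobs are done; no sleep loop needed
    results = collect(jobs)

print(results[7])
```

<p>A thread pool is used here only so the sketch runs anywhere without pickling concerns; the same calls apply to futures returned by a process pool.</p>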
<details>
<summary> Save spark dataframe to sparse scipy arrays </summary>
<pre><code>import pickle
import subprocess
from functools import partial
from glob import glob
from tempfile import TemporaryDirectory
from typing import Iterable, List, NamedTuple

import numpy as np
import pyspark.ml as pm
from pyspark import TaskContext
from scipy.sparse import csr_matrix, lil_matrix, load_npz, save_npz, vstack

def sparseVectorList_to_CSRMatrix(X: List[pm.linalg.SparseVector]) -> csr_matrix:
    """ Convert list of pyspark sparse vectors to a scipy CSR matrix that
    a standard sklearn function/lightgbm can consume.
    """
    M = lil_matrix((len(X), X[0].size), dtype=np.float64)
    for i, x in enumerate(X):
        I = np.argsort(x.indices)
        M.rows[i] = x.indices[I].tolist()
        M.data[i] = x.values[I].tolist()
    return M.tocsr(copy=False)

class RowToPredict(NamedTuple):
    "This class was created just to facilitate linting and type hinting."
    customer_id: str
    features: pm.linalg.SparseVector

def save_features_in_spark_as_sparsescipy_to_hdfs(
        hdfs_dir,
        row_gen: Iterable[RowToPredict]):
    C, Flist = [], []
    for e in row_gen:
        Flist.append(e.features)
        C.append(e.customer_id)
    if not Flist:  # nothing to save for an empty partition
        return
    F = sparseVectorList_to_CSRMatrix(Flist)
    pid = TaskContext.get().partitionId()
    # Ideally I will upload file directly to HDFS, but I don't know how to directly
    # write to HDFS. hdfscli didn't work for me. So work-around is to save to local
    # file on task node, then upload to HDFS with a subprocess call.
    with TemporaryDirectory() as tmpdirname:
        print('created temporary directory', tmpdirname)
        fname = f'{tmpdirname}/F.{pid}.npz'
        cname = f'{tmpdirname}/C.{pid}.pkl'
        save_npz(fname, F)
        with open(cname, 'wb') as fh:
            pickle.dump(C, fh)
        subprocess.getstatusoutput(f'hadoop fs -put {fname} {cname} {hdfs_dir}')
    return

# sdf is an existing Spark DataFrame with customer_id and features columns.
# make sure that each partition has a reasonable number of rows so that we don't OOM.
npart = max(1, sdf.count() // 10000)
sdf.rdd.repartition(npart).foreachPartition(partial(
    save_features_in_spark_as_sparsescipy_to_hdfs, '/data/'))
# After saving all the parts to hdfs, download the parts, and open them on master node.
subprocess.getstatusoutput('hadoop fs -copyToLocal /data/ /home/hadoop/')
# Sort so that features and customer ids are read in the same partition order.
L = sorted(glob('/home/hadoop/data/F.*.npz'))
F = vstack([load_npz(e) for e in L])
C = [c for e in L
     for c in pickle.load(open(e.replace('/F.', '/C.').replace('.npz', '.pkl'), 'rb'))] </code></pre>
</details>
<details>
<summary> How to hash check pip files </summary>
<pre><code>Steps:
1. Install virtualenv and pip-tools.
2. Create workspace, download package.
3. Go where the requirements file is.
4. Create fresh empty environment and activate it.
5. Install all requirements.
6. Generate hashes for all installed packages.
7. Close the shell and create new one.
8. Create fresh empty environment and activate it.
9. Check that the new requirements file can be installed with --require-hashes.

Commands:
pip install virtualenv pip-tools
python3 -m venv env; source env/bin/activate
pip list > before; pip install -r requirements.txt; pip list > after
pip-compile requirements.txt --generate-hashes # this overwrites the original file.
exit; bash
python3 -m venv env2; source env2/bin/activate
pip install --require-hashes -r requirements.txt
</code></pre>
</details>
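<p>Conceptually, <code>--require-hashes</code> makes pip compare the digest of each downloaded archive against the <code>--hash=sha256:...</code> entries pinned in the requirements file. A minimal sketch of that comparison using only <code>hashlib</code>; the helper names and the in-memory "archive" are hypothetical, and this illustrates the idea rather than pip's actual implementation.</p>

```python
import hashlib

def sha256_hex(data):
    # Hex digest in the same form that follows "sha256:" in a requirements file.
    return hashlib.sha256(data).hexdigest()

def matches_pinned_hash(data, pinned):
    # pinned looks like "sha256:HEX"; only sha256 is handled in this sketch.
    algorithm, _, expected = pinned.partition(":")
    return algorithm == "sha256" and sha256_hex(data) == expected

archive = b"pretend these are the bytes of a downloaded wheel"
pinned = "sha256:" + sha256_hex(archive)
print(matches_pinned_hash(archive, pinned))         # True
print(matches_pinned_hash(archive + b"x", pinned))  # False
```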
<details>
<summary> hdfs file system functionality exposed to python </summary>
<pre><code>import subprocess

def hdfs_exists(path, flag='-e'):
    # `hadoop fs -test` exits with 0 when the test passes (-e: the path exists).
    code, output = subprocess.getstatusoutput(f'hadoop fs -test {flag} {path}')
    if code != 0:
        print(output)
        return False
    return True

def copyFromLocal(src, dst):
    return subprocess.getstatusoutput(f'hadoop fs -copyFromLocal {src} {dst}')

def copyToLocal(src, dst):
    return subprocess.getstatusoutput(f'hadoop fs -copyToLocal {src} {dst}') </code></pre>
</details>
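<p>The helpers above interpolate paths straight into a shell command, so a path containing spaces or shell metacharacters would be split or interpreted by the shell. A hedged sketch of one way to guard against that with <code>shlex.quote</code>; <code>hadoop_fs_command</code> is a hypothetical helper, not part of the snippet above.</p>

```python
import shlex

def hadoop_fs_command(action, *paths):
    # Build a `hadoop fs -ACTION path...` string with every path shell-quoted.
    quoted = " ".join(shlex.quote(p) for p in paths)
    return f"hadoop fs -{action} {quoted}"

cmd = hadoop_fs_command("copyToLocal", "/data/my file.npz", "/home/hadoop/")
print(cmd)
```

<p>The resulting string can be passed to <code>subprocess.getstatusoutput</code> exactly as the original helpers do.</p>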
<details>
<summary> Spark Setup </summary>
<pre><code>import logging
import os
import re
import warnings
from io import StringIO

import yaml
from pyspark.sql import SparkSession

def setup(RUNDATE='2020-07-16', spark_setting_file=None):
    """ Construct spark session, setup logger, and read YAML files
    from prime-ml-repo in production EMR clusters. After reading yaml files
    format the paths with dates.
    RUNDATE is a date string like this '2020-05-30'
    """
    assert re.fullmatch(r'\d{4}-\d{2}-\d{2}', RUNDATE)
    if spark_setting_file is None:
        # Keys under `scoring:` must stay indented so that the YAML nests.
        spark_setting_file = StringIO("""scoring:
  spark.executor.memory: '20G'
  spark.executor.memoryOverhead: '4G'
  spark.executor.cores: 4
  spark.task.cpus: 1
  spark.yarn.am.memory: '2G'
  spark.serializer: 'org.apache.spark.serializer.KryoSerializer'
  spark.driver.maxResultSize: 0
  spark.executor.extraJavaOptions: '-XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps'
  spark.kryoserializer.buffer.max: '512M'
  spark.cleaner.periodicGC.interval: '5min'
  spark.network.timeout: '600s'""")
    # Use logger to log everything to file and also to stderr.
    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
    )
    fh = logging.FileHandler("/home/hadoop/scoring_log_file.log")
    fh.setLevel(logging.INFO)
    fh.setFormatter(formatter)
    logger.addHandler(fh)
    ch = logging.StreamHandler()
    ch.setLevel(logging.INFO)
    ch.setFormatter(formatter)
    logger.addHandler(ch)
    logger.info(f"Initialize job parameters. {RUNDATE}")
    parameters = {}
    logger.info("Initialize spark settings and spark session.")
    spark = SparkSession.builder.appName("claire")
    for key, value in yaml.safe_load(spark_setting_file)["scoring"].items():
        logger.info(f'spark: {key}={value}')
        spark = spark.config(key, value)
    spark = spark.enableHiveSupport().getOrCreate()
    spark.sparkContext.setLogLevel(os.environ.get("SPARK_LOG_LEVEL", "DEBUG"))
    try:
        spark.sparkContext.setCheckpointDir("hdfs:///checkpoint/spark/")
    except Exception as exception:
        warnings.warn("Unable to set spark checkpoint directory !")
    return spark, parameters, logger </code></pre>
</details>
<details>
<summary> Common model inspections on binary classification test set</summary>
<pre><code>from types import SimpleNamespace

import numpy as np
import sklearn.metrics as skm

def tabulate(label, proba, **kwargs):
    """ Compute common statistics on a binary classification problem
    given the true labels and the class probabilities.
    label is a 0/1 numpy array; proba has one column per class.
    """
    assert proba.shape[1] == 2
    assert len(label) == len(proba)
    R = SimpleNamespace()
    R.accuracy = (label == proba.argmax(1)).mean()
    R.majority_rule_accuracy = max(1 - label.mean(), label.mean())
    R.log_loss = -np.log(np.select([label==0, label==1],
                                   [proba[:, 0], proba[:, 1]])).mean()
    fpr, tpr, thresholds = skm.roc_curve(label, proba[:, 1])
    R.roc_auc = skm.auc(fpr, tpr)
    try:
        idx = np.where(fpr < 0.05)[0].max()
        R.tp_at_fp_less_than_5_percent = tpr[idx]
        R.fp_at_fp_less_than_5_percent = fpr[idx]
        R.threshold_at_fp_less_than_5_percent = thresholds[idx]
    except ValueError as e:
        print(e)
    precision, recall, thresholds = skm.precision_recall_curve(label, proba[:, 1])
    R.prauc = skm.auc(recall, precision)
    R.precision = precision
    R.recall = recall
    R.thresholds = thresholds
    # precision and recall have one more entry than thresholds (the final
    # precision=1, recall=0 point), so clamp the index used for thresholds.
    last = len(thresholds) - 1
    idx = np.where(precision > 0.9)[0].min()
    R.smallest_precision_greater_than_90pct = precision[idx]
    R.recall_at_precision_90pct = recall[idx]
    R.threshold_at_precision_90pct = thresholds[min(idx, last)]
    idx = np.where(precision > 0.5)[0].min()
    R.smallest_precision_greater_than_50pct = precision[idx]
    R.recall_at_precision_50pct = recall[idx]
    R.threshold_at_precision_50pct = thresholds[min(idx, last)]
    R = R.__dict__
    R.update(kwargs)
    return R </code></pre>
</details>
<details>
<summary> Boilerplate for configuring logger in python </summary>
<pre><code>import logging

def setup_logger(file_path=None):
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
    )
    # FileHandler subclasses StreamHandler, so compare the exact type here
    # to avoid adding a second StreamHandler on repeated calls.
    if all(type(e) is not logging.StreamHandler
           for e in logger.handlers):
        ch = logging.StreamHandler()
        ch.setLevel(logging.INFO)
        ch.setFormatter(formatter)
        logger.addHandler(ch)
        logger.info("Initialized logger with StreamHandler")
    if file_path and all(not isinstance(e, logging.FileHandler)
                         for e in logger.handlers):
        fh = logging.FileHandler(file_path, 'a')
        fh.setLevel(logging.INFO)
        fh.setFormatter(formatter)
        logger.addHandler(fh)
        logger.info(f"Initialized logger with FileHandler({file_path})")
    return logger </code></pre>
</details>
<details>
<summary> Java - Python/Numpy Fast Copy-Free Exchange </summary>
<pre><code> /* Java */
short a = 3; // 2
long b = 5; // 8
float c = (float)7.0; // 4
ByteBuffer bb = ByteBuffer.allocate(14);
bb.order(ByteOrder.LITTLE_ENDIAN);
bb.putShort(a);
pp(bb.position()); // 2
bb.putLong(b);
pp(bb.position()); // 10
bb.putFloat(c);
pp(bb.position()); // 14
bb.position(0); // crucial.
try(RandomAccessFile f = new RandomAccessFile("/tmp/tmp.dat", "rw");
FileChannel fc = (f).getChannel();) {
pp(bb.order().toString());
pp(fc.write(bb));
}
## Python
from mmap import mmap, PROT_READ
import os
import numpy as np
import sys
assert sys.byteorder == 'little'
fd = os.open('/tmp/tmp.dat', os.O_RDONLY)
buf = mmap(fd, 14, prot=PROT_READ)
# L = little endian.
arr1 = np.frombuffer(buf, dtype=np.dtype('int16').newbyteorder('L'), count=1, offset=0)
arr2 = np.frombuffer(buf, dtype=np.dtype('int64').newbyteorder('L'), count=1, offset=2)
arr3 = np.frombuffer(buf, dtype=np.dtype('float32').newbyteorder('L'), count=1, offset=10)
arr1, arr2, arr3 </code></pre>
</details>
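<p>For a quick check of the 14-byte layout without NumPy, the same record can be decoded with the standard <code>struct</code> module. A minimal sketch: the bytes are built in memory rather than read from <code>/tmp/tmp.dat</code>, and the format <code>"=hqf"</code> (native byte order, standard sizes, no padding) matches the Java writer only on a little-endian host, which the assertion checks.</p>

```python
import struct
import sys

assert sys.byteorder == "little"  # same assumption as the NumPy reader

# "=": native byte order with standard sizes and no alignment padding.
# h: int16 (2 bytes), q: int64 (8 bytes), f: float32 (4 bytes).
RECORD = struct.Struct("=hqf")

payload = RECORD.pack(3, 5, 7.0)  # stands in for the bytes Java wrote
a, b, c = RECORD.unpack_from(memoryview(payload), 0)
print(RECORD.size, a, b, c)
```

<p>Unlike the <code>np.frombuffer</code> views, <code>unpack_from</code> copies the three values into Python objects, so this is for inspection rather than copy-free exchange.</p>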
<details>
<summary> Java - Simple printing function. </summary>
<pre><code> static void pp(Object format, Object... args) {
System.out.printf(format.toString(), args);
System.out.println();
} </code></pre>
</details>
<details>
<summary> Json encode numpy objects </summary>
<pre><code>import json

import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        elif type(obj).__module__ == 'numpy':
            name = type(obj).__name__
            if name.startswith('float'):
                return float(obj)
            elif name.startswith(('int', 'uint')):
                return int(obj)
            elif name.startswith('bool'):
                return bool(obj)
        # Anything else falls through to the base class, which raises TypeError.
        return json.JSONEncoder.default(self, obj)

# Usage: json.dumps(data, cls=NumpyEncoder) </code></pre>
</details>
<details>
<summary> List busy EMR machines and the total reserved machines </summary>
<pre><code> aws emr list-clusters --region us-east-1 --active > /tmp/tmp1
jq '.Clusters|.[]|.Id' /tmp/tmp1 -r | xargs -n 1 -I % sh -c 'aws emr list-instances --cluster-id % --instance-states RUNNING --region us-east-1' > /tmp/tmp2
jq '.Instances | .[] | .InstanceType' /tmp/tmp2 -r | sort | uniq -c
aws ec2 describe-reserved-instances --region us-east-1 | jq '.ReservedInstances | .[] | [.InstanceType, .InstanceCount, .State]' -c -r | fgrep -v retired | sort </code></pre>
</details>
<details>
<summary> Excel VBA functions for testing significance of binomial A/B tests. Two-sided Z-Test P-value from ratios and counts </summary>
<pre><code> 'Two-sided ZTest Pvalue from counts
Public Function CountsZTest(count1, nob1, count2, nob2)
CountsZTest = RatioZTest(count1 * 1# / nob1, nob1, count2 * 1# / nob2, nob2)
End Function
Public Function RatioZTest(p1, nob1, p2, nob2)
diff = p1 - p2
p_pooled = (p1 * nob1 + p2 * nob2) * 1# / (nob1 + nob2)
nobs_2xhm = 1# / nob1 + 1# / nob2
var1 = p_pooled * (1 - p_pooled) * nobs_2xhm
std_diff = Sqr(var1)
RatioZTest = Application.WorksheetFunction.Norm_S_Dist(-Abs(diff / std_diff), True) * 2
End Function </code></pre>
</details>
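<p>For readers without Excel, the same pooled two-proportion z-test can be sketched in Python with only the standard library. The function names are hypothetical counterparts of the VBA functions above; the two-sided p-value 2&Phi;(-|z|) is computed as <code>erfc(|z| / sqrt(2))</code>, which is the same quantity.</p>

```python
import math

def ratio_z_test(p1, nob1, p2, nob2):
    # Two-sided z-test p-value from two observed rates and their sample sizes.
    diff = p1 - p2
    p_pooled = (p1 * nob1 + p2 * nob2) / (nob1 + nob2)
    var = p_pooled * (1 - p_pooled) * (1 / nob1 + 1 / nob2)
    z = diff / math.sqrt(var)
    # 2 * Phi(-|z|) equals erfc(|z| / sqrt(2)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))

def counts_z_test(count1, nob1, count2, nob2):
    # Same test, starting from success counts instead of rates.
    return ratio_z_test(count1 / nob1, nob1, count2 / nob2, nob2)

print(counts_z_test(60, 100, 40, 100))
```

<p>As in the VBA version, the pooled rate must lie strictly between 0 and 1; otherwise the variance is zero and the division fails.</p>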
<!-- <details>
<summary> XXX </summary>
<pre><code> YYYY </code></pre>
</details>
-->
<h2>Technical Notes</h2>
<table>
<tbody>
<tr><td><details>
<summary>Steps for adding a note</summary>
<p>If you have a PDF then store it in the posts folder. Clone posts/pdftemplate.html and search-replace any mention of note1.pdf from the html. Finally, link that html file here. Otherwise copy posts/template.html and modify it.
<!-- <tr><td> <a href="res/XXXX.html">NOTE XXXX</a>XXXX.</td></tr> -->
</p>
</details></td> </tr>
<tr><td>
<a href="res/offline-eval-learning-bandits.pdf">Note 12</a> Offline evaluation and learning in bandits.
</td></tr>
<tr><td><details><summary>Note 11 - Sparsity in Deep Learning Layers<a href="posts/sparsity-in-deep-learning.html">(open in new page)</a></summary>
<iframe src="posts/sparsity-in-deep-learning.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details></td></tr>
<tr><td> <a href="res/exploration-scavenging.html">Note 10</a> Exploration Scavenging in comparison to other off-policy estimators. </td></tr>
<tr><td> <a href="https://www.youtube.com/watch?v=pKuVUmpYkLk">Note 9</a> Using a PID Controller for controlling the number of servers in a data-center.<a href="https://news.ycombinator.com/item?id=27732236">[YC]</a><a href="https://gist.github.com/pushpendre/359706010c20bc1d18123510749f5da5">[Gist]</a></td></tr>
<tr><td> <a href="res/bankroll_kelly.html">Note 8</a> Using the Kelly criterion for bankroll management. </td></tr>
<tr><td> <a href="res/mvue-foundations.html">Note 7</a> The foundations for finding MVUE via the use of Rao-Blackwellization.</td></tr>
<tr><td> <a href="res/the-vw-faq.html">Note 6</a> The VW FAQ.</td></tr>
<tr><td> <a href="/ViewerJS/?zoom=page-width#../res/the-basics-of-zeromq.pdf">Note 5</a> The Basics of ZeroMQ.</td></tr>
<tr><td> <a href="res/graphical_summary_of_elements_of_information_theory.html">Note 4</a> A visual summary of the inequalities governing Entropy, Cross Entropy, Joint Entropy, KL Divergence, and Mutual Information, including the Data Processing Inequality.</td></tr>
<tr><td> <a href="res/note3.pdf">Note 3</a> [WIP] A visual proof of the UCB algorithm. </td></tr>
<tr><td> <a href="res/note2.mp4">Note 2</a> A video tutorial about the difference between PnL and cash flow, and how a company can have positive cash flow but still make a loss without raising debt. (You may need to download the video and play it with VLC.)</td></tr>
<tr><td>
<a href="res/note1.html">Note 1</a>: Describes how the variance of an AB test can be reduced in the special case when we are comparing two policies with the same small-finite action space.
<details>
<summary>Description (translated from Hindi)</summary>
<p>We want to find out how much two methods/treatments differ. The usual approach is A/B testing (a randomized controlled trial), in which we randomly assign method A to half of the people and method B to the other half, and then measure the difference between the averages of the two groups. This is the simplest approach; suppose that with it we would need to test on 10000 people in order to detect a 10% difference between the two groups. The PDF I sent presents the analysis of a special case that uses 25% fewer samples. Of course this is not a new technique; I wrote it up only for my own understanding.</p>
</details>
</td>
</tr>
</tbody>
</table>
<h2>Trivia</h2>
<details><summary>how to add trivia</summary>
<p>First compile a pdf, either in overleaf, or using latexmk, then add all the assets, the .tex and .pdf file to res/trivia folder.
Then add the iframe with src, loading, width, height attributes.</p>
</details>
<!-- <details><summary></summary></details> -->
<details><summary>Poses, quaternions, and the SE(3) Manifold<a href="posts/poses-quaternions-se3-manifold.html">(open in new page)</a></summary>
<iframe src="posts/poses-quaternions-se3-manifold.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>The interpretation of nu in nu-SVM<a href="posts/nu-svm-interpretation.html">(open in new page)</a></summary>
<iframe src="posts/nu-svm-interpretation.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>Sampling methods for Bayesian Statistics<a href="posts/sampling-methods-for-bayesian-statistics.html">(open in new page)</a></summary>
<iframe src="posts/sampling-methods-for-bayesian-statistics.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>Basics of Hamiltonian Dynamics<a href="posts/hamiltonian-dynamics.html">(open in new page)</a></summary>
<iframe src="posts/hamiltonian-dynamics.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>High Contrast ML<a href="posts/high-contrast-ml.html">(open in new page)</a></summary>
<iframe src="posts/high-contrast-ml.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>Lessons from the Open Pre-Trained Transformer Logbook<a href="posts/lessons-from-opt-logbook.html">(open in new page)</a></summary>
<iframe src="posts/lessons-from-opt-logbook.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>The optimization landscape<a href="posts/the-optimization-landscape.html">(open in new page)</a></summary>
<iframe src="posts/the-optimization-landscape.html" loading="lazy" onload='javascript:(function(o){o.style.height=Math.min(600, o.contentWindow.document.body.scrollHeight)+"px";}(this));' style="height:200px;width:100%;border:none;overflow:hidden;"></iframe>
</details>
<details><summary>The distribution of many-to-one functions of random variables</summary>
<p>The formula for the change of distribution under one-to-one differentiable maps is well known. If <span class="math-tex">\( f \)</span> maps <span class="math-tex">\( x \)</span> to <span class="math-tex">\( y \)</span> then <span class="math-tex">\( p_Y(y) = p_X(f^{-1}(y)) \left| \det J_{f^{-1}}(y) \right| \)</span>, where <span class="math-tex">\( J_{f^{-1}} \)</span> is the Jacobian of the inverse map. But what happens in the more general case where the function is not invertible? This comes up in situations like <span class="math-tex">\( Z = X/Y \)</span> or <span class="math-tex">\( Z = X + Y \)</span>. In such situations the cleanest method is to write the CDF as the probability of the underlying event and then differentiate it:</p>
<p><span class="math-tex">\[ \begin{align}p_Z(z) &= \frac{d}{dz} \int_{\mathbf{x}} \mathbb{I}[f(\mathbf{x}) < z] \ p_{\mathbf{X}}(\mathbf{x}) d\mathbf{x} = \frac{d}{dz} \int_{\{\mathbf{x} \,:\, f(\mathbf{x}) < z\}} \ p_{\mathbf{X}}(\mathbf{x}) d\mathbf{x}
\end{align} \]</span></p>
<p>Now recall that <span class="math-tex">\( \frac{d}{dx}\int_a^{u(x)} g(t) dt = u'(x) g(u(x)) \)</span>, so if we can write the boundary of the acceptable region as a function of <span class="math-tex">\( z \)</span> then we are in business. For example, if <span class="math-tex">\( z = x_2/x_1 \)</span> then the acceptable region is <span class="math-tex">\( \{x_1 > 0, x_2 < zx_1\} \cup \{x_1 < 0, x_2 > zx_1\} \)</span> and the density is <span class="math-tex">\[ p_Z(z) = \int_0^\infty x_1 p(x_1, zx_1)dx_1 + \int_{-\infty}^0-x_1p(x_1, zx_1)dx_1 \]</span>where the integral over the region has been written as an iterated integral (Fubini's theorem) and the derivative has been moved inside the outer integral (Leibniz integral rule), assuming both apply in this case.</p>
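<p>As a quick sanity check of the ratio formula above, here is a worked case under an assumption that is not part of the original note: take <span class="math-tex">\( X_1, X_2 \)</span> to be independent standard normals, so that <span class="math-tex">\( p(x_1, x_2) = \frac{1}{2\pi} e^{-\frac{1}{2}(x_1^2 + x_2^2)} \)</span>. The two integrals combine into a single integral against <span class="math-tex">\( |x_1| \)</span>, and the integrand is even in <span class="math-tex">\( x_1 \)</span>, which gives</p>
<p><span class="math-tex">\[ \begin{align} p_Z(z) &= \int_{-\infty}^{\infty} |x_1| \, \frac{1}{2\pi} e^{-\frac{1}{2}x_1^2(1+z^2)} dx_1 = \frac{1}{\pi}\int_0^\infty x_1 e^{-\frac{1}{2}x_1^2(1+z^2)} dx_1 = \frac{1}{\pi(1+z^2)} \end{align} \]</span></p>
<p>using <span class="math-tex">\( \int_0^\infty x e^{-c x^2/2} dx = 1/c \)</span> with <span class="math-tex">\( c = 1 + z^2 \)</span>. This is the standard Cauchy density, which is the known distribution of the ratio of two independent standard normals, so the formula passes the check.</p>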
</details>
<details><summary>Calculus Theorems</summary>
<ul>
<li>MVT - if f is continuous on [a, b] and differentiable on (a, b), then there is at least one point c in (a, b) where the derivative equals the slope of the line joining the two endpoints, i.e. <span class="math-tex">\( f'(c) = \frac{f(b) - f(a)}{b - a} \)</span>.</li>
<li>EVT - continuous real-valued functions on compact sets attain their extrema.</li>
<li>IVT - if f is a continuous real-valued function then the image of an interval is also an interval.</li>
<li>Taylor's Expansion - This is the best degree-k polynomial approximation near a: <span class="math-tex">\( f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!} (x - a)^2 + \ldots + \frac{f^{(k)}(a)}{k!} (x - a)^k + h_k(x) (x - a)^k \)</span> with <span class="math-tex">\( \lim_{x \to a}h_k(x) = 0 \)</span>, for functions that are k times differentiable at a. Taylor's theorem also has a multivariate version.</li>
<li>Newton's method for a root of f iterates <span class="math-tex">\( x_{k+1} = x_k - f(x_k)/f'(x_k) \)</span>, which is derived from the first-order Taylor expansion. Newton's method for optimization applies this to the gradient, so it involves the inverse of the derivative of the gradient, i.e. the inverse of the Hessian.</li>
<li>Darboux's theorem - Every function that has an antiderivative, i.e. that is the derivative of another function, satisfies the intermediate value property even if it is not continuous. See this page for an example of a function with a <a href="https://calculus.subwiki.org/wiki/Derivative_of_differentiable_function_need_not_be_continuous">non-continuous derivative</a> and see this page <a href="https://math.stackexchange.com/questions/292275/discontinuous-derivative">for a more exotic function</a>.</li>
<li>Clairaut's theorem for symmetry of second derivatives, i.e. <span style="text-decoration: underline; color: #b96ad9;">when can we interchange differentiation</span>? The second partial derivatives need to exist and be continuous.</li>
<li>Fubini's theorem, i.e. <span style="text-decoration: underline; color: #b96ad9;">when can we interchange integration</span>? Absolute integrability implies that the order of integration can be interchanged.</li>
<li>Leibniz Integral rule, i.e. <span style="text-decoration: underline;"><span style="color: #e67e23; text-decoration: underline;">when can we interchange differentiation and integration</span></span>? Let <span class="math-tex">\( f(x,t) \)</span> be a function such that both f and its partial derivative <span class="math-tex">\( f_x \)</span> are continuous, and suppose that the limits of integration <span class="math-tex">\( a(x) \)</span> and <span class="math-tex">\( b(x) \)</span> are continuous and have continuous derivatives. Then <br><span class="math-tex">\[ \frac{d}{dx}\Big(\int_{a(x)}^{b(x)} f(x,t) dt\Big) = f(x,b(x)) b'(x) - f(x,a(x))a'(x) + \int_{a(x)}^{b(x)} \frac{\partial f(x,t)}{\partial x}dt \]</span></li>
<li>Inverse function theorem -- If f is continuously differentiable in a neighborhood of <span class="math-tex">\( a \)</span> and its derivative is nonzero at <span class="math-tex">\( a \)</span>, then f is invertible in a neighborhood of <span class="math-tex">\( a \)</span>, the inverse is continuously differentiable, and its derivative at <span class="math-tex">\( b = f(a) \)</span> is <span class="math-tex">\( 1/f'(f^{-1}(b)) \)</span>.</li>
<li>Implicit function theorem -- Given a system of m equations <span class="math-tex">\( \{f_i(x_1, \ldots, x_n, y_1, \ldots, y_m) = 0 \mid i=1,\ldots,m\} \)</span> satisfied at a point <span class="math-tex">\( (\bar{a},\bar{b}) \)</span>, if the system is continuously differentiable in a neighborhood of <span class="math-tex">\( (\bar{a},\bar{b}) \)</span> and a mild condition on the partial derivatives with respect to <span class="math-tex">\( y_i \)</span> holds (the Jacobian of partial derivatives with respect to y is invertible at that point), then the y variables are unique continuously differentiable functions of <span class="math-tex">\( \{ x_j \} \)</span> in some neighborhood of the point. In fact the theorem also gives the formula for the Jacobian of this unique function.</li>
</ul>
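<p>Two of the items above can be spelled out in a line each. The iterate index <span class="math-tex">\( k \)</span>, the objective <span class="math-tex">\( g \)</span>, and the examples are labels chosen here for illustration. For Newton's method, linearize <span class="math-tex">\( f \)</span> around the current iterate and set the linearization to zero; for optimization, apply the same step to the gradient of <span class="math-tex">\( g \)</span>:</p>
<p><span class="math-tex">\[ \begin{align} 0 = f(x_k) + f'(x_k)(x_{k+1} - x_k) \ &\implies \ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} \\ 0 = \nabla g(x_k) + \nabla^2 g(x_k)(x_{k+1} - x_k) \ &\implies \ x_{k+1} = x_k - \big[\nabla^2 g(x_k)\big]^{-1} \nabla g(x_k) \end{align} \]</span></p>
<p>For instance, with <span class="math-tex">\( f(x) = x^2 - 2 \)</span> and <span class="math-tex">\( x_0 = 1 \)</span> the first two steps give <span class="math-tex">\( x_1 = 1 + \frac{1}{2} = \frac{3}{2} \)</span> and <span class="math-tex">\( x_2 = \frac{3}{2} - \frac{1/4}{3} = \frac{17}{12} \approx 1.4167 \)</span>, already close to <span class="math-tex">\( \sqrt{2} \)</span>.</p>
<p>For the implicit function theorem, writing the system as <span class="math-tex">\( f(x, y) = 0 \)</span> and the implicit function as <span class="math-tex">\( y = y(x) \)</span>, differentiating <span class="math-tex">\( f(x, y(x)) = 0 \)</span> with the chain rule gives the Jacobian formula</p>
<p><span class="math-tex">\[ \frac{\partial y}{\partial x} = -\Big[\frac{\partial f}{\partial y}\Big]^{-1} \frac{\partial f}{\partial x} \]</span></p>
<p>As a one-dimensional check, the unit circle <span class="math-tex">\( x^2 + y^2 - 1 = 0 \)</span> gives <span class="math-tex">\( dy/dx = -x/y \)</span> wherever <span class="math-tex">\( y \neq 0 \)</span>.</p>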
</details>
<details><summary>Difference between testing for non-stationarity and testing for trend</summary>
<p>Strict stationarity simply means that the n-th order joint pdfs are invariant to shifts in time. Wide-sense stationarity (or loose-sense stationarity) means that the mean is constant in time and that the correlation between any two observations depends only on the time difference between them. Stationarity does not mean that the correlation function R<sub>X</sub>(T<sub>1</sub>, T<sub>2</sub>) is zero or a delta function. No! Wide-sense stationarity just means that the correlation function is invariant in time and depends only on the difference τ = T<sub>2</sub> - T<sub>1</sub>. Note that stationarity / non-stationarity describes a process, not the actual observed signal. The observed signals can still exhibit periodicity while being wide-sense stationary.</p>
<p>In an auto-regressive (AR) process, the current value of the signal is a linear function of past observations plus a noise term. This process can be non-stationary if the system has a so-called "unit root", but AR processes <strong>may</strong> still be stationary without trend. On the other hand, moving average (MA) processes are defined as finite linear combinations of hidden iid error terms, so an MA process is <strong>always</strong> stationary.</p>
<p>The Dickey-Fuller procedure tests whether a unit root is present in an AR process. There are different versions of this test depending on exactly what type of AR process is assumed, such as the Augmented Dickey-Fuller test, and related tests such as KPSS differ even in which hypothesis is taken as the null. But none of these tests can check whether a trend is present in the dataset, because these "stationarity" tests can give false positives for trend. If a *DF test or KPSS test says that the signal is stationary then rest assured we can say that no trend exists, but when these tests say that the signal is non-stationary, it is possible that they are just detecting a unit-root random walk, which obviously does not have an actual trend.</p>
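<p>To make the unit-root case concrete, here is a small worked comparison. The AR(1) form, the coefficient <span class="math-tex">\( \phi \)</span>, the noise variance <span class="math-tex">\( \sigma^2 \)</span> and the slope <span class="math-tex">\( \beta \)</span> are illustrative choices, not something taken from the note above. Consider <span class="math-tex">\( x_t = \phi x_{t-1} + \epsilon_t \)</span> with iid zero-mean noise of variance <span class="math-tex">\( \sigma^2 \)</span>:</p>
<p><span class="math-tex">\[ \begin{align} |\phi| < 1 : \quad & \mathbb{E}[x_t] = 0, \qquad \mathrm{Cov}(x_t, x_{t+\tau}) = \frac{\sigma^2 \phi^{|\tau|}}{1 - \phi^2} \\ \phi = 1 : \quad & x_t = x_0 + \sum_{s=1}^{t} \epsilon_s, \qquad \mathbb{E}[x_t] = x_0, \qquad \mathrm{Var}(x_t) = t\sigma^2 \end{align} \]</span></p>
<p>In the first case (taken in steady state) the moments depend only on the lag <span class="math-tex">\( \tau \)</span>, so the process is wide-sense stationary. In the second case the mean is constant, so there is no deterministic trend, yet the variance grows with <span class="math-tex">\( t \)</span> and the process is non-stationary. By contrast, a process with an actual trend such as <span class="math-tex">\( x_t = \beta t + \epsilon_t \)</span> has mean <span class="math-tex">\( \beta t \)</span>. A unit-root test that flags the random walk as non-stationary is therefore not evidence of such a trend.</p>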
</details>
<details><summary>Hessian with Backprop.</summary>
<p><iframe src="/ViewerJS/?zoom=page-width#../res/trivia/hessian-with-backprop.pdf" loading="lazy" width="720" height="400"></iframe></p>
</details>
<details><summary>The delta method.</summary>
<p><iframe src="/ViewerJS/?zoom=page-width#../res/trivia/the-delta-method.pdf" loading="lazy" width="720" height="400"></iframe></p>
</details>
<details><summary>Variational characterization of the absolute value function.</summary>
<p><iframe src="/ViewerJS/?zoom=page-width#../res/trivia/variational-characterization-of-absolute-value.pdf" loading="lazy" width="720" height="400"></iframe></p>
</details>
<details><summary>The T distribution and its relation to sampling.</summary>
<p><iframe src="/ViewerJS/?zoom=page-width#../res/trivia/the-t-distribution.pdf" loading="lazy" width="720" height="400"></iframe></p>
</details>
<h2>Hobby Software/Hardware Projects (Personal)</h2>
<details><summary>FlagPlanter / Timestamper App for deepfake prevention, Evidence Timestamping with working apk</summary>
<p><a href="https://github.com/pushpendre/flagplant/">The flag planter app helps you claim priority over an idea without revealing the idea itself to the world.</a></p>
</details>
<h2>Hobby Software/Hardware Projects (Others)</h2>
<p>▷ Hosted Jsmol: an open-source Javascript viewer for chemical structures in 3D. <a href="https://pushpendre.github.io/res/bio/jsmol/">link.</a></p>
<p>▷ Hosted JSME: an open-source Javascript Molecule Editor. <a href="https://pushpendre.github.io/res/bio/jsme/">link.</a></p>
<h2>Diary</h2>
<details><summary>Mom</summary>
<p><a href="res/mom.html">My mom's untimely death.</a></p>
</details>
<details><summary>Dad</summary>
<p><a href="res/dad.html">How we navigated my father's diagnosis of cardiac ischemia.</a></p>
</details>
</div>
</div>
</body>
</html>