+
+Dr. Carlos Maltzahn is the founder and director of the UC Santa Cruz Center for Research in Open Source Software (CROSS). Dr. Maltzahn also co-founded the Systems Research Lab, known for its cutting-edge work on programmable storage systems, big data storage & processing, scalable data management, distributed system performance management, and practical reproducible evaluation of computer systems. Carlos joined UC Santa Cruz in 2004, after five years at NetApp working on network intermediaries and storage systems. In 2005 he co-founded [Sage Weil](https://en.wikipedia.org/wiki/Sage_Weil)’s [Ceph project](https://ceph.io) and became one of its key mentors. In 2008 Carlos became a member of the computer science faculty at UC Santa Cruz and has graduated ten Ph.D. students since. Carlos graduated with an M.S. and Ph.D. in Computer Science from the University of Colorado at Boulder.
+
+His work is funded by nonprofits, government, and industry, including [NSF TI-2229773](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2229773), [NSF OAC-2226407](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2226407), the [Alfred P. Sloan Foundation G-2021-16957](https://sloan.org/grant-detail/9723), DOE ASCR DE-NA0003525 (FWP 20-023266, subcontractor of Sandia National Labs), [NSF OAC-1836650](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1836650), [NSF CNS-1764102](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1764102), [NSF CNS-1705021](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1705021), DOE ASCR DE-SC0016074, [NSF OAC-1450488](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1450488), and [CROSS](https://cross.ucsc.edu).
-Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed neque elit, tristique placerat feugiat ac, facilisis vitae arcu. Proin eget egestas augue. Praesent ut sem nec arcu pellentesque aliquet. Duis dapibus diam vel metus tempus vulputate.
+For more details, you can read his [vitae](https://drive.google.com/file/d/0B5rZ7hI6vXv3R2p0SEYxZlk5Vjg/view?usp=sharing).
diff --git a/content/authors/admin/avatar.jpg b/content/authors/admin/avatar.jpg
index d1361fd8e04..ed67f443622 100644
Binary files a/content/authors/admin/avatar.jpg and b/content/authors/admin/avatar.jpg differ
diff --git a/content/authors/carlosm/_index.md b/content/authors/carlosm/_index.md
new file mode 100644
index 00000000000..9a7d94e8f09
--- /dev/null
+++ b/content/authors/carlosm/_index.md
@@ -0,0 +1,180 @@
+---
+# Display name
+title: Carlos Maltzahn
+
+# Username (this should match the folder name)
+author: carlosm
+authors:
+- carlosm
+
+# Is this the primary user of the site?
+superuser: true
+
+# Role/position
+role: Retired Adjunct Professor, Sage Weil Presidential Chair for Open Source Software, Founding Director of CROSS and OSPO
+
+# Organizations/Affiliations
+organizations:
+- name: Open Source Program Office, UC Santa Cruz (OSPO)
+ url: "https://ospo.ucsc.edu"
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Department of Computer Science & Engineering
+ url: "https://engineering.ucsc.edu/departments/computer-science-and-engineering"
+- name: Baskin School of Engineering
+ url: "https://engineering.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio: My research interests include programmable storage systems, big data storage & processing, scalable data management, distributed systems performance management, and practical reproducible research.
+
+#education:
+# courses:
+# - course: M.S. Computer Science, University of Colorado at Boulder
+# year: 1997
+# - course: Ph.D. Computer Science, University of Colorado at Boulder
+# year: 1999
+
+administration:
+- "[Stephanie Lieggi](https://www.linkedin.com/in/stephanie-lieggi-8542624/)"
+# - "[Lavinia Preston](https://www.linkedin.com/in/lavinia-preston-60806b127/)"
+
+researchers:
+- "[Oskar Elek](https://elek.pub)"
+# - "[Kate Compton](https://www.galaxykate.com/)"
+# - "[Ivo Jimenez](http://www.linkedin.com/in/ivotron)"
+
+currentphds:
+- "[Jayjeet Chakraborty](https://www.linkedin.com/in/jayjeet-chakraborty-077579162/) (current advisor: [Heiner Litz](https://www.linkedin.com/in/heiner-litz-3a332713/))"
+- "[Esmaeil Mirvakili](https://www.linkedin.com/in/esmaeil-m-12a71879/) (current advisor: [Chen Qian](https://www.linkedin.com/in/chen-qian-7b59b521/))"
+- "[Farid Zakaria](https://www.linkedin.com/in/fmzakari/) (current advisor: [Andrew Quinn](https://arquinn.github.io/))"
+
+# currentmss:
+# - "[Saloni Rane](https://www.linkedin.com/in/saloni-rane/)"
+
+graduatedphds:
+ courses:
+ - course: "[Alexander Ames](http://www.linkedin.com/in/sashaames)"
+ year: "2011 ([thesis](https://search.proquest.com/docview/926578311))"
+ - course: "[Joe Buck](http://www.linkedin.com/pub/joe-buck/3/70a/97a)"
+ year: "2014 ([thesis](http://escholarship.org/uc/item/2gn5x6df))"
+ - course: "[Adam Crume](http://www.linkedin.com/pub/adam-crume/48/7b3/330)"
+ year: "2015 ([thesis](http://escholarship.org/uc/item/9gs8x5n8), [slides](https://drive.google.com/file/d/0B5rZ7hI6vXv3OEhoZVFxNzQ4Qlk/view?usp=sharing&resourcekey=0-Fmqd5UR_6EHJVPpWFHW9Dg))"
+ - course: "[Latchesar Ionkov](http://www.linkedin.com/pub/latchesar-ionkov/2/b9b/768)"
+ year: "2018 ([thesis](https://escholarship.org/uc/item/4vs7g3pk), [slides](https://drive.google.com/file/d/1HsZAFm6RDHI_sZlqx0L94vXoUMO03By8/view?usp=sharing))"
+ - course: "[Ivo Jimenez](http://www.linkedin.com/in/ivotron)"
+ year: "2019 ([thesis](https://escholarship.org/uc/item/8206n6nz), [slides](https://docs.google.com/presentation/d/16SDV4etFvGVRmxuPNns97ivSJv1IleJd/edit?usp=sharing&ouid=105297454540541468964&rtpof=true&sd=true))"
+ - course: "[Jianshen Liu](https://www.linkedin.com/in/jianshenliu/)"
+ year: "2023 ([thesis](https://escholarship.org/uc/item/8nt6c6pj), [slides](https://docs.google.com/presentation/d/16AaXtVfVrPRFy8EOTcC90LMYpuQnpDZI/edit?usp=sharing&ouid=105297454540541468964&rtpof=true&sd=true))"
+ - course: "[Michael Sevilla](http://www.linkedin.com/in/michaelandrewsevilla)"
+ year: "2018 ([thesis](https://escholarship.org/uc/item/3wd0x3b4), [slides](https://docs.google.com/presentation/d/1pbVKC8HLvuihNpf6NJ3SS1rtc7O2DHgrEB2Oj3ZiWyo/edit?usp=sharing))"
+ - course: "[Andrew Shewmaker](http://www.linkedin.com/in/ashewmaker)"
+ year: "2016 ([thesis](http://escholarship.org/uc/item/2dz5d7w4))"
+ - course: "[Dimitrios Skourtis](http://www.linkedin.com/in/skourtis)"
+ year: "2014 ([thesis](http://escholarship.org/uc/item/4wj9r8np))"
+ - course: "[Noah M. Watkins](http://www.linkedin.com/in/noahwatkins)"
+ year: "2018 ([thesis](https://escholarship.org/uc/item/72n6c5kq), [slides](https://docs.google.com/presentation/d/1wlhs59LS1bEzynKK_yaE_NSD2jgf9eeH/edit?usp=sharing&ouid=105297454540541468964&rtpof=true&sd=true))"
+
+graduatedmss:
+ courses:
+ - course: "[Trivikram Bollempalli](https://www.linkedin.com/in/trivikram-bollempalli-079a375b/)"
+ year: "2017"
+ - course: "[Zheyuan Chen](https://www.linkedin.com/in/zheyuan-chen/)"
+ year: "2017"
+ - course: "[Xiaowei Chu](https://www.linkedin.com/in/xweichu/)"
+ year: "2021"
+ - course: "[Abhishek Grover](https://www.linkedin.com/in/abhishek-grover-8183a024/)"
+ year: "2016"
+ - course: "[Bettie Jea](https://www.linkedin.com/in/bettiejea/)"
+ year: "2017"
+ - course: "[Neha Ojha](https://www.linkedin.com/in/nehaojha/)"
+ year: "2017"
+ - course: "[Saloni Rane](https://www.linkedin.com/in/saloni-rane/)"
+ year: "2020"
+ - course: "[Mariette Souppe](https://www.linkedin.com/in/msouppe/)"
+ year: "2019"
+ - course: "[Greeshma Swaminathan](https://www.linkedin.com/in/greeshmaswaminathan/)"
+ year: "2017"
+ - course: "[Ranjan S. Venkatesh](https://www.linkedin.com/in/ranjansv/)"
+ year: "2013"
+ - course: "[Haiyu Yang](https://www.linkedin.com/in/haiyu-yang-120652b4/)"
+ year: "2017"
+ - course: "[Yiming Zhang](https://www.linkedin.com/in/yiming-steven-zhang/)"
+ year: "2021"
+
+# - course: "[Alexander Ames](http://www.linkedin.com/in/sashaames)" #
+# year: "[2011](https://search.proquest.com/docview/926578311)"
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: '#contact' # For a direct email link, use "mailto:test@example.org".
+- icon: calendar-week
+ icon_pack: fas
+ link: 'calendar'
+- icon: home
+ icon_pack: fas
+ link: https://people.ucsc.edu/carlosm
+- icon: google-scholar
+ icon_pack: ai
+ link: https://scholar.google.co.uk/citations?hl=en&user=ntU7-hwAAAAJ
+- icon: arxiv
+ icon_pack: ai
+ link: https://arxiv.org/search/?query=Carlos+Maltzahn&searchtype=author&abstracts=show&order=-announced_date_first&size=50
+- icon: acmdl
+ icon_pack: ai
+ link: https://dl.acm.org/profile/81452607446
+- icon: dblp
+ icon_pack: ai
+ link: https://dblp.org/pid/83/2134.html?q=Carlos%20Maltzahn
+- icon: archive
+ icon_pack: ai
+ link: https://scholar.archive.org/search?q=Carlos%20Maltzahn
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/carlosmaltzahn/
+- icon: mastodon
+ icon_pack: fab
+ link: https://discuss.systems/@Carlos
+- icon: github
+ icon_pack: fab
+ link: https://github.com/carlosmalt
+- icon: dryad
+ icon_pack: ai
+ link: https://www.genealogy.math.ndsu.nodak.edu/id.php?id=92188
+#- icon: researchgate
+# icon_pack: ai
+# link: https://www.researchgate.net/profile/Carlos_Maltzahn
+- icon: orcid
+ icon_pack: fab
+ link: https://orcid.org/0000-0001-8305-0748
+# Link to a PDF of your resume/CV from the About widget.
+# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
+- icon: cv
+ icon_pack: ai
+ link: https://drive.google.com/file/d/0B5rZ7hI6vXv3R2p0SEYxZlk5Vjg/view?usp=sharing
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Principal Investigator
+---
+
+
+Carlos Maltzahn [retired](/news/20231215/) from UC Santa Cruz on December 15, 2023. He was the founder and director of the [UC Santa Cruz Center for Research in Open Source Software (CROSS)](https://cross.ucsc.edu) and the [Open Source Program Office (OSPO) UC Santa Cruz](/). He also co-founded the Systems Research Lab, known for its cutting-edge work on programmable storage systems, big data storage & processing, scalable data management, distributed system performance management, and practical reproducible evaluation of computer systems. Carlos joined UC Santa Cruz in 2004, after five years at NetApp working on network intermediaries and storage systems. In 2005 he co-founded [Sage Weil](https://en.wikipedia.org/wiki/Sage_Weil)’s [Ceph project](https://ceph.io) and became one of its key mentors. In 2008 Carlos became a member of the computer science faculty at UC Santa Cruz and has graduated ten Ph.D. students since. Carlos graduated with an M.S. and Ph.D. in Computer Science from the University of Colorado at Boulder.
+
+His work was funded by nonprofits, government, and industry, including DOE ASCR DE-AC52-07NA27344 (subcontract B661322 under master agreement B654505, subcontractor of Lawrence Livermore National Lab), [DOE HEP DE-SC0023527](https://watchep.org), [NSF TI-2229773](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2229773), [NSF OAC-2226407](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2226407), the [Alfred P. Sloan Foundation G-2021-16957](https://sloan.org/grant-detail/9723), DOE ASCR DE-NA0003525 (FWP 20-023266, subcontractor of Sandia National Labs), [NSF OAC-1836650](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1836650), [NSF CNS-1764102](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1764102), [NSF CNS-1705021](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1705021), DOE ASCR DE-SC0016074, [NSF OAC-1450488](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1450488), and [CROSS](https://cross.ucsc.edu).
+
+For more details, you can read his [vitae](https://drive.google.com/file/d/0B5rZ7hI6vXv3R2p0SEYxZlk5Vjg/view?usp=sharing).
diff --git a/content/authors/carlosm/avatar.jpg b/content/authors/carlosm/avatar.jpg
new file mode 100644
index 00000000000..ed67f443622
Binary files /dev/null and b/content/authors/carlosm/avatar.jpg differ
diff --git a/content/authors/fmzakari/_index.md b/content/authors/fmzakari/_index.md
new file mode 100644
index 00000000000..ce544214be5
--- /dev/null
+++ b/content/authors/fmzakari/_index.md
@@ -0,0 +1,48 @@
+---
+# Display name
+title: Farid Zakaria
+
+# Username (this should match the folder name)
+authors:
+- fmzakari
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: PhD Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:fmzakari@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/fmzakari/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Current PhD Students
+---
diff --git a/content/authors/fmzakari/avatar.jpg b/content/authors/fmzakari/avatar.jpg
new file mode 100644
index 00000000000..8faedacf4c1
Binary files /dev/null and b/content/authors/fmzakari/avatar.jpg differ
diff --git a/content/authors/inassi/_index.md b/content/authors/inassi/_index.md
new file mode 100644
index 00000000000..22fefd1424d
--- /dev/null
+++ b/content/authors/inassi/_index.md
@@ -0,0 +1,50 @@
+---
+# Display name
+title: Ike Nassi
+
+# Username (this should match the folder name)
+authors:
+- inassi
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Adjunct Professor
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:inassi@ucsc.edu
+- icon: home
+ icon_pack: fas
+ link: https://www.nassi.com/
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/ikenassi/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Faculty
+---
diff --git a/content/authors/inassi/avatar.jpg b/content/authors/inassi/avatar.jpg
new file mode 100644
index 00000000000..b765cfa18b4
Binary files /dev/null and b/content/authors/inassi/avatar.jpg differ
diff --git a/content/authors/ivo/_index.md b/content/authors/ivo/_index.md
new file mode 100644
index 00000000000..05ccc2c23b1
--- /dev/null
+++ b/content/authors/ivo/_index.md
@@ -0,0 +1,62 @@
+---
+# Display name
+title: Ivo Jimenez
+
+# Username (this should match the folder name)
+authors:
+- ivo
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Assistant Adjunct Professor
+
+# Organizations/Affiliations
+organizations:
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:ivo@cs.ucsc.edu
+- icon: home
+ icon_pack: fas
+ link: https://users.soe.ucsc.edu/~ivo/
+- icon: twitter
+ icon_pack: fab
+ link: https://twitter.com/ivotron
+- icon: google-scholar
+ icon_pack: ai
+ link: https://scholar.google.co.uk/citations?user=_f4sYhoAAAAJ
+- icon: github
+ icon_pack: fab
+ link: https://github.com/ivotron
+- icon: linkedin
+ icon_pack: fab
+ link: http://www.linkedin.com/in/ivotron
+- icon: orcid
+ icon_pack: fab
+ link: https://orcid.org/0000-0002-2222-1985
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+# - Faculty
+---
diff --git a/content/authors/ivo/avatar.png b/content/authors/ivo/avatar.png
new file mode 100644
index 00000000000..5c82cc7662d
Binary files /dev/null and b/content/authors/ivo/avatar.png differ
diff --git a/content/authors/jchakra1/_index.md b/content/authors/jchakra1/_index.md
new file mode 100644
index 00000000000..59c9943c7b3
--- /dev/null
+++ b/content/authors/jchakra1/_index.md
@@ -0,0 +1,49 @@
+---
+# Display name
+title: Jayjeet Chakraborty
+
+# Username (this should match the folder name)
+authors:
+- jchakra1
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: PhD Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:jchakra1@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/jayjeet-chakraborty-077579162/
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Current PhD Students
+---
diff --git a/content/authors/jchakra1/avatar.jpg b/content/authors/jchakra1/avatar.jpg
new file mode 100644
index 00000000000..1362007fef5
Binary files /dev/null and b/content/authors/jchakra1/avatar.jpg differ
diff --git a/content/authors/jlefevre/_index.md b/content/authors/jlefevre/_index.md
new file mode 100644
index 00000000000..50985cd066d
--- /dev/null
+++ b/content/authors/jlefevre/_index.md
@@ -0,0 +1,64 @@
+---
+# Display name
+title: Jeff LeFevre
+
+# Username (this should match the folder name)
+authors:
+- jlefevre
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Assistant Adjunct Professor
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:jlefevre@ucsc.edu
+- icon: home
+ icon_pack: fas
+ link: https://users.soe.ucsc.edu/~jlefevre/
+- icon: google-scholar
+ icon_pack: ai
+ link: https://scholar.google.co.uk/citations?hl=en&user=aS_jReUAAAAJ
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/jefflefevre/
+- icon: github
+ icon_pack: fab
+ link: https://github.com/jlefevre
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Faculty
+---
+My research interests are in cloud databases, database physical design, and storage systems. I currently lead the [Skyhook: programmable storage for databases project](https://skyhookdm.com) as part of the [Center for Research in Open Source Software](https://cross.ucsc.edu) at UC Santa Cruz. Skyhook is an open source project that extends the Ceph distributed object store with customized data management functions. I also collaborate with the Systems Research Lab on the larger [programmable storage](http://www.programmability.us/) effort. The Skyhook project is also part of [Google Summer of Code 2019](https://summerofcode.withgoogle.com/organizations/4813203146539008/), where we mentor a student working on data formats for Skyhook.
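+
+As a hedged illustration of the idea (not Skyhook's actual API), the sketch below invokes a custom Ceph object-class method from Python, assuming a recent python-rados that exposes `Ioctx.execute()`; the pool, object, class, and method names ("tabular-pool", "skyhook", "filter_rows") and the query string are hypothetical placeholders:
+
+```python
+import rados
+
+cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
+cluster.connect()
+ioctx = cluster.open_ioctx('tabular-pool')  # assumes this pool exists
+try:
+    # execute() runs a registered OSD object-class method on the object,
+    # so the filter executes server-side and only matching rows return.
+    ret, rows = ioctx.execute('table.obj.0', 'skyhook', 'filter_rows',
+                              b'age > 30')
+    print(ret, rows)
+finally:
+    ioctx.close()
+    cluster.shutdown()
+```
+
+Skyhook registers richer methods than this placeholder, but the pattern of shipping the query to the object rather than the object to the query is the core of the programmable-storage approach.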
+
+I received my PhD in June 2014 from the UC Santa Cruz Database group and subsequently joined Hewlett Packard Big Data R&D (Vertica database), where I worked on integrating Vertica with external analytics engines such as Distributed-R and Apache Spark. At UC Santa Cruz my PhD advisor was [Neoklis Polyzotis](https://research.google.com/pubs/NeoklisPolyzotis.html), and my PhD thesis is titled "Physical design tuning methods for emerging system architectures". My [thesis](http://escholarship.org/uc/item/7ck0q3nn) ([abstract](https://users.soe.ucsc.edu/~jlefevre/abstract.txt)) introduces new physical design methods for databases in the cloud. Specifically, I address RDBMS, Hadoop, and hybrid 'multistore' (combined RDBMS + Hadoop co-processing) system architectures.
+
+Previously, I received my MS from the [University of California, San Diego](http://www-cse.ucsd.edu/) in the [Systems and Networking Group](http://www.sysnet.ucsd.edu/sysnet/). My MS advisor was [Walt Burkhard](http://www.jacobsschool.ucsd.edu/faculty/faculty_bios/index.sfe?fmp_recid=100), and my MS thesis, "Improving disk array performance and reliability", introduces a data layout and scheduling policy for RAID arrays. I received a BS in Computer Science & Engineering from the [University of South Florida](http://www.cse.usf.edu/), where I did research on unique encodings for DNA languages. During graduate school I spent several summers at NEC Labs working on CloudDB in the Data Management group, at Google in the Platforms Storage group, and at Teradata in the Virtual Storage Architecture group.
diff --git a/content/authors/jlefevre/avatar.jpg b/content/authors/jlefevre/avatar.jpg
new file mode 100644
index 00000000000..307a03a0822
Binary files /dev/null and b/content/authors/jlefevre/avatar.jpg differ
diff --git a/content/authors/jliu120/_index.md b/content/authors/jliu120/_index.md
new file mode 100644
index 00000000000..965869e3fd9
--- /dev/null
+++ b/content/authors/jliu120/_index.md
@@ -0,0 +1,49 @@
+---
+# Display name
+title: Jianshen Liu
+
+# Username (this should match the folder name)
+authors:
+- jliu120
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: PhD Student, CROSS Research Fellow
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:jliu120@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/jianshenliu/
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+#- Current PhD Students
+---
diff --git a/content/authors/jliu120/avatar.jpg b/content/authors/jliu120/avatar.jpg
new file mode 100644
index 00000000000..910a407f552
Binary files /dev/null and b/content/authors/jliu120/avatar.jpg differ
diff --git a/content/authors/kecompto/_index.md b/content/authors/kecompto/_index.md
new file mode 100644
index 00000000000..ef1a6972ab3
--- /dev/null
+++ b/content/authors/kecompto/_index.md
@@ -0,0 +1,54 @@
+---
+# Display name
+title: Kate Compton
+
+# Username (this should match the folder name)
+authors:
+- kecompto
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Postdoc, CROSS Incubator Fellow
+
+# Organizations/Affiliations
+organizations:
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:galaxykate@gmail.com
+- icon: home
+ icon_pack: fas
+ link: https://www.galaxykate.com/
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/katecompton/
+- icon: twitter
+ icon_pack: fab
+ link: https://twitter.com/galaxykate
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups: []
+# - Researchers
+---
+Weird Futurist. Maker of many interesting things. Geologic choreographer, dance breeder, crafter of twitching generative bots. Ask me about JavaScript. She/her.
diff --git a/content/authors/kecompto/avatar.jpg b/content/authors/kecompto/avatar.jpg
new file mode 100644
index 00000000000..ae78a8cd89c
Binary files /dev/null and b/content/authors/kecompto/avatar.jpg differ
diff --git a/content/authors/lpreston/_index.md b/content/authors/lpreston/_index.md
new file mode 100644
index 00000000000..2093283758b
--- /dev/null
+++ b/content/authors/lpreston/_index.md
@@ -0,0 +1,47 @@
+---
+# Display name
+title: Lavinia Preston
+
+# Username (this should match the folder name)
+authors:
+- lpreston
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Administrative Assistant
+
+# Organizations/Affiliations
+organizations:
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:lpreston@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/lavinia-preston-60806b127/
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+#user_groups:
+#- Administration
+---
diff --git a/content/authors/lpreston/avatar.jpg b/content/authors/lpreston/avatar.jpg
new file mode 100644
index 00000000000..974c7c807e5
Binary files /dev/null and b/content/authors/lpreston/avatar.jpg differ
diff --git a/content/authors/oelek/_index.md b/content/authors/oelek/_index.md
new file mode 100644
index 00000000000..64a013ab6d4
--- /dev/null
+++ b/content/authors/oelek/_index.md
@@ -0,0 +1,56 @@
+---
+# Display name
+title: Oskar Elek
+
+# Username (this should match the folder name)
+author: "oelek"
+authors:
+- oelek
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: OSPO Incubator Fellow, UCSC Postdoctoral Researcher
+
+# Organizations/Affiliations
+organizations:
+- name: Open Source Program Office, UC Santa Cruz (OSPO)
+ url: "https://ospo.ucsc.edu"
+- name: Department of Computer Science & Engineering
+ url: "https://engineering.ucsc.edu/departments/computer-science-and-engineering"
+- name: Baskin School of Engineering
+ url: "https://engineering.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio: Oskar designs algorithms that draw from complex fractal systems and explores their applications in astrophysics and cosmology, as well as computational art and design.
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:oelek@ucsc.edu
+- icon: home
+ icon_pack: fas
+ link: https://elek.pub/
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/oskar-elek-9981b013/
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Researchers
+---
+
+Dr. Oskar Elek is a computational scientist and adjunct lecturer at the University of California, Santa Cruz, in the Departments of Computer Science and Computational Media. His expertise includes computer graphics, data visualization, numerical simulation, and computational art. His work draws inspiration from the complexity of nature and fractal geometry.
+
+Oskar's main project is [PolyPhy](https://github.com/PolyPhyHub/PolyPhy), an open source, Python-centric, interdisciplinary software package that builds on the earlier prototype [Polyphorm](https://github.com/CreativeCodingLab/Polyphorm). This work demonstrates that a multi-agent algorithm inspired by the *Physarum polycephalum* "slime mold" can be used to reconstruct the intergalactic network of gas and dark matter known as the cosmic web. The same algorithm can reconstruct network structures in various other kinds of data and even perform geometric and structural modeling.
+
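+To make the agent-based idea concrete, here is a minimal, illustrative sketch (not PolyPhy's actual code) of one Physarum-style update step: each agent senses a shared trail field ahead of it, turns toward the strongest deposit, moves forward, and reinforces the trail it crosses. All parameter values are illustrative defaults:
+
+```python
+import numpy as np
+
+# pos: float array (n, 2); angle: float array (n,); trail: float array (h, w)
+def physarum_step(pos, angle, trail, sense_dist=9.0, sense_angle=0.4,
+                  turn=0.3, speed=1.0, deposit=1.0):
+    h, w = trail.shape
+    # Sample the trail at three sensors: left of, along, and right of heading.
+    headings = np.stack([angle - sense_angle, angle, angle + sense_angle])
+    sx = (pos[:, 0] + sense_dist * np.cos(headings)).astype(int) % w
+    sy = (pos[:, 1] + sense_dist * np.sin(headings)).astype(int) % h
+    samples = trail[sy, sx]                        # shape (3, n)
+    # Turn toward the strongest deposit (row 0 = left, 1 = center, 2 = right).
+    angle = angle + (samples.argmax(axis=0) - 1) * turn
+    # Move forward on a toroidal grid and deposit onto the trail field.
+    pos[:, 0] = (pos[:, 0] + speed * np.cos(angle)) % w
+    pos[:, 1] = (pos[:, 1] + speed * np.sin(angle)) % h
+    trail[pos[:, 1].astype(int), pos[:, 0].astype(int)] += deposit
+    return pos, angle, trail * 0.95                # mild evaporation
+```
+
+Iterated over many agents, steps like this make the trail field condense onto filament-like network structures, which is the behavior the reconstruction exploits.
+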
+You can find out more about this work on Oskar's [personal website](https://elek.pub); contact him directly if you'd like to get involved in any capacity.
\ No newline at end of file
diff --git a/content/authors/oelek/avatar.jpg b/content/authors/oelek/avatar.jpg
new file mode 100644
index 00000000000..09e0ed9a003
Binary files /dev/null and b/content/authors/oelek/avatar.jpg differ
diff --git a/content/authors/palvaro/_index.md b/content/authors/palvaro/_index.md
new file mode 100644
index 00000000000..e29b5e265bf
--- /dev/null
+++ b/content/authors/palvaro/_index.md
@@ -0,0 +1,55 @@
+---
+# Display name
+title: Peter Alvaro
+
+# Username (this should match the folder name)
+authors:
+- palvaro
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Associate Professor
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+ url: "https://www.soe.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:palvaro@ucsc.edu
+- icon: home
+ icon_pack: fas
+ link: https://people.ucsc.edu/~palvaro
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/peteralvaro/
+- icon: twitter
+ icon_pack: fab
+ link: https://twitter.com/palvaro
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Faculty
+---
diff --git a/content/authors/palvaro/avatar.jpg b/content/authors/palvaro/avatar.jpg
new file mode 100644
index 00000000000..6b5625a68f0
Binary files /dev/null and b/content/authors/palvaro/avatar.jpg differ
diff --git a/content/authors/sadepoju/_index.md b/content/authors/sadepoju/_index.md
new file mode 100644
index 00000000000..fdd045d4f49
--- /dev/null
+++ b/content/authors/sadepoju/_index.md
@@ -0,0 +1,48 @@
+---
+# Display name
+title: Saheed Adepoju
+
+# Username (this should match the folder name)
+authors:
+- sadepoju
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: PhD Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:sadepoju@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/saheedadepoju/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Current PhD Students
+---
diff --git a/content/authors/sadepoju/avatar.jpg b/content/authors/sadepoju/avatar.jpg
new file mode 100644
index 00000000000..7b4dd227e4e
Binary files /dev/null and b/content/authors/sadepoju/avatar.jpg differ
diff --git a/content/authors/sarane/_index.md b/content/authors/sarane/_index.md
new file mode 100644
index 00000000000..bee34ffce12
--- /dev/null
+++ b/content/authors/sarane/_index.md
@@ -0,0 +1,47 @@
+---
+# Display name
+title: Saloni Rane
+
+# Username (this should match the folder name)
+authors:
+- sarane
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: MS Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:sarane@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/saloni-rane/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups: []
+---
diff --git a/content/authors/sarane/avatar.jpg b/content/authors/sarane/avatar.jpg
new file mode 100644
index 00000000000..98ba76b1c0c
Binary files /dev/null and b/content/authors/sarane/avatar.jpg differ
diff --git a/content/authors/slieggi/_index.md b/content/authors/slieggi/_index.md
new file mode 100644
index 00000000000..c38818f1eed
--- /dev/null
+++ b/content/authors/slieggi/_index.md
@@ -0,0 +1,52 @@
+---
+# Display name
+title: Stephanie Lieggi
+
+# Username (this should match the folder name)
+authors:
+- slieggi
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: Executive Director of OSPO and CROSS
+
+# Organizations/Affiliations
+organizations:
+- name: Center for Research in Open Source Software (CROSS)
+ url: "https://cross.ucsc.edu"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:lieggi@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/stephanie-lieggi-8542624/
+- icon: twitter
+ icon_pack: fab
+ link: https://twitter.com/sclieggi
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Administration
+---
+Stephanie Lieggi is the Executive Director of OSPO and CROSS and has been with CROSS since 2016. Prior to coming to UCSC, she worked as a Senior Research Associate at the Center for Nonproliferation Studies at the Middlebury Institute of International Studies at Monterey. She served as an editor of the CNS publications [Asian Export Control Observer](http://cns.miis.edu/pubs/observer/asian/) and [International Export Control Observer](http://cns.miis.edu/pubs/observer/). Previously, she worked at the [Organization for the Prohibition of Chemical Weapons](https://www.opcw.org/).
diff --git a/content/authors/slieggi/avatar.jpg b/content/authors/slieggi/avatar.jpg
new file mode 100644
index 00000000000..47b2eac81f7
Binary files /dev/null and b/content/authors/slieggi/avatar.jpg differ
diff --git a/content/authors/smirvaki/_index.md b/content/authors/smirvaki/_index.md
new file mode 100644
index 00000000000..60bb1b7e2be
--- /dev/null
+++ b/content/authors/smirvaki/_index.md
@@ -0,0 +1,48 @@
+---
+# Display name
+title: Esmaeil Mirvakili
+
+# Username (this should match the folder name)
+authors:
+- smirvaki
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: PhD Student, CROSS Research Fellow
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:smirvaki@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/esmaeil-m-12a71879/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+user_groups:
+- Current PhD Students
+---
diff --git a/content/authors/smirvaki/avatar.jpg b/content/authors/smirvaki/avatar.jpg
new file mode 100644
index 00000000000..d018d893e5e
Binary files /dev/null and b/content/authors/smirvaki/avatar.jpg differ
diff --git a/content/authors/xchu1/_index.md b/content/authors/xchu1/_index.md
new file mode 100644
index 00000000000..2cfd79d31b0
--- /dev/null
+++ b/content/authors/xchu1/_index.md
@@ -0,0 +1,48 @@
+---
+# Display name
+title: Xiaowei (Aaron) Chu
+
+# Username (this should match the folder name)
+authors:
+- xchu1
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: MS Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:xweichu@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/xweichu/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+#user_groups:
+#- Current PhD Students
+---
diff --git a/content/authors/xchu1/avatar.jpg b/content/authors/xchu1/avatar.jpg
new file mode 100644
index 00000000000..935791380d0
Binary files /dev/null and b/content/authors/xchu1/avatar.jpg differ
diff --git a/content/authors/yzhan298/.DS_Store b/content/authors/yzhan298/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/authors/yzhan298/.DS_Store differ
diff --git a/content/authors/yzhan298/_index.md b/content/authors/yzhan298/_index.md
new file mode 100644
index 00000000000..06bc1996d83
--- /dev/null
+++ b/content/authors/yzhan298/_index.md
@@ -0,0 +1,48 @@
+---
+# Display name
+title: Yiming (Steven) Zhang
+
+# Username (this should match the folder name)
+authors:
+- yzhan298
+
+# Is this the primary user of the site?
+superuser: false
+
+# Role/position
+role: MS Student
+
+# Organizations/Affiliations
+organizations:
+- name: Department of Computer Science & Engineering
+ url: "https://www.soe.ucsc.edu/departments/computer-science-and-engineering"
+- name: Jack Baskin School of Engineering
+ url: "https://www.soe.ucsc.edu"
+- name: University of California, Santa Cruz
+ url: "https://www.ucsc.edu"
+
+
+# Short bio (displayed in user profile at end of posts)
+bio:
+
+# Social/Academic Networking
+# For available icons, see: https://sourcethemes.com/academic/docs/widgets/#icons
+# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
+# form "mailto:your-email@example.com" or "#contact" for contact widget.
+social:
+- icon: envelope
+ icon_pack: fas
+ link: mailto:yzhan298@ucsc.edu
+- icon: linkedin
+ icon_pack: fab
+ link: https://www.linkedin.com/in/yiming-steven-zhang/
+
+
+# Enter email to display Gravatar (if Gravatar enabled in Config)
+email: ""
+
+# Organizational groups that you belong to (for People widget)
+# Set this to `[]` or comment out if you are not using People widget.
+#user_groups:
+#- Current PhD Students
+---
diff --git a/content/authors/yzhan298/avatar.jpg b/content/authors/yzhan298/avatar.jpg
new file mode 100644
index 00000000000..49a761bd1e9
Binary files /dev/null and b/content/authors/yzhan298/avatar.jpg differ
diff --git a/content/calendar.md b/content/calendar.md
new file mode 100644
index 00000000000..ea4255f092a
--- /dev/null
+++ b/content/calendar.md
@@ -0,0 +1,31 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "Calendar"
+subtitle: ""
+summary: ""
+authors: ["admin"]
+tags: []
+categories: []
+date: 2020-02-22T17:48:52-08:00
+lastmod: 2020-02-22T17:48:52-08:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+
+
\ No newline at end of file
diff --git a/content/talk/_index.md b/content/event/_index.md
similarity index 100%
rename from content/talk/_index.md
rename to content/event/_index.md
diff --git a/content/home/about.md b/content/home/about.md
index 13d45394fa1..28cf8df70fd 100644
--- a/content/home/about.md
+++ b/content/home/about.md
@@ -2,10 +2,10 @@
# About widget.
widget = "about" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 20 # Order that this section will appear in.
-title = "Biography"
+title = "Introduction"
# Choose the user profile to display
# This should be the username of a profile in your `content/authors/` folder.
diff --git a/content/home/aboutcm.md b/content/home/aboutcm.md
new file mode 100644
index 00000000000..a7688db5bdc
--- /dev/null
+++ b/content/home/aboutcm.md
@@ -0,0 +1,14 @@
++++
+# About widget.
+widget = "aboutcm" # See https://sourcethemes.com/academic/docs/page-builder/
+headless = true # This file represents a page section.
+active = true # Activate this widget? true/false
+weight = 10 # Order that this section will appear in.
+
+title = "Introduction"
+
+# Choose the user profile to display
+# This should be the username of a profile in your `content/authors/` folder.
+# See https://sourcethemes.com/academic/docs/get-started/#introduce-yourself
+author = "carlosm"
++++
diff --git a/content/home/accomplishments.md b/content/home/accomplishments.md
index e2fd9a21812..8fdba2c2adc 100644
--- a/content/home/accomplishments.md
+++ b/content/home/accomplishments.md
@@ -2,7 +2,7 @@
# Accomplishments widget.
widget = "accomplishments" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 50 # Order that this section will appear.
title = "Accomplishments"
diff --git a/content/home/contact.md b/content/home/contact.md
index 6459d09f311..6429b85b45d 100644
--- a/content/home/contact.md
+++ b/content/home/contact.md
@@ -3,7 +3,7 @@
widget = "contact" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
active = true # Activate this widget? true/false
-weight = 130 # Order that this section will appear.
+weight = 60 # Order that this section will appear.
title = "Contact"
subtitle = ""
@@ -15,6 +15,15 @@ autolink = true
# 0: Disable email form
# 1: Netlify (requires that the site is hosted by Netlify)
# 2: formspree.io
-email_form = 2
+email_form = 0
+++
+### Office
+Baskin Engineering 2, Room 369 ([directions](https://users.soe.ucsc.edu/~carlosm/UCSC/Directions.html))
+
+Hours: By appointment ([calendar](calendar))
+
+Phone: +1 (831) 459-1627
+
+Public key: [1430E52A](http://users.soe.ucsc.edu/~carlosm/1430E52A.asc)
+
+[**Campus Directory**](https://campusdirectory.ucsc.edu/cd_detail?uid=carlosm)
diff --git a/content/home/demo.md b/content/home/demo.md
index 54800cf90f4..8e7fc1fad6c 100644
--- a/content/home/demo.md
+++ b/content/home/demo.md
@@ -5,7 +5,7 @@
widget = "blank" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 15 # Order that this section will appear.
title = "Academic Kickstart"
@@ -23,18 +23,18 @@ subtitle = ""
# Background color.
# color = "navy"
-
+
# Background gradient.
gradient_start = "DarkGreen"
gradient_end = "ForestGreen"
-
+
# Background image.
# image = "image.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# image_size = "cover" # Options are `cover` (default), `contain`, or `actual` size.
# image_position = "center" # Options include `left`, `center` (default), or `right`.
# image_parallax = true # Use a fun parallax-like fixed background effect? true/false
-
+
# Text color (true=light or false=dark).
text_color_light = true
@@ -43,9 +43,9 @@ subtitle = ""
padding = ["20px", "0", "20px", "0"]
[advanced]
- # Custom CSS.
+ # Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+++
@@ -66,10 +66,10 @@ For inspiration, check out [the Markdown files](https://sourcethemes.com/academi
- [Decorate your laptop or journal with an Academic sticker](https://www.redbubble.com/people/neutreno/works/34387919-academic)
- [Wear the T-shirt](https://academic.threadless.com/)
-{{% alert note %}}
+{{% callout note %}}
This homepage section is an example of adding [elements](https://sourcethemes.com/academic/docs/writing-markdown-latex/) to the [*Blank* widget](https://sourcethemes.com/academic/docs/widgets/).
Backgrounds can be applied to any section. Here, the *background* option is set to give a *color gradient*.
**To remove this section, delete `content/home/demo.md`.**
-{{% /alert %}}
+{{% /callout %}}
diff --git a/content/home/experience.md b/content/home/experience.md
index 12d7b3dbfd5..f2b2c799567 100644
--- a/content/home/experience.md
+++ b/content/home/experience.md
@@ -2,7 +2,7 @@
# Experience widget.
widget = "experience" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 40 # Order that this section will appear.
title = "Experience"
diff --git a/content/home/featured.md b/content/home/featured.md
index 345b6eac108..e3138392ca4 100644
--- a/content/home/featured.md
+++ b/content/home/featured.md
@@ -5,7 +5,7 @@
widget = "featured" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 80 # Order that this section will appear.
title = "Featured Publications"
@@ -14,7 +14,7 @@ subtitle = ""
[content]
# Page type to display. E.g. post, talk, or publication.
page_type = "publication"
-
+
  # Choose how many pages you would like to display (0 = all pages)
count = 0
@@ -26,7 +26,7 @@ subtitle = ""
tag = ""
category = ""
publication_type = ""
-
+
[design]
# Toggle between the various page layout types.
# 1 = List
@@ -34,31 +34,32 @@ subtitle = ""
# 3 = Card
# 4 = Citation (publication only)
view = 3
-
+
[design.background]
# Apply a background color, gradient, or image.
# Uncomment (by removing `#`) an option to apply it.
# Choose a light or dark text color by setting `text_color_light`.
# Any HTML color name or Hex value is valid.
-
+
# Background color.
# color = "navy"
-
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
# Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+
+++
diff --git a/content/home/posts.md b/content/home/news.md
similarity index 90%
rename from content/home/posts.md
rename to content/home/news.md
index 12b3b6f3ec5..69c82f7f369 100644
--- a/content/home/posts.md
+++ b/content/home/news.md
@@ -5,18 +5,18 @@
widget = "pages" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
active = true # Activate this widget? true/false
-weight = 60 # Order that this section will appear.
+weight = 20 # Order that this section will appear.
-title = "Recent Posts"
+title = "Recent News"
subtitle = ""
[content]
# Page type to display. E.g. post, talk, or publication.
- page_type = "post"
-
+ page_type = "news"
+
  # Choose how many pages you would like to display (0 = all pages)
count = 5
-
+
# Choose how many pages you would like to offset by
offset = 0
@@ -26,10 +26,10 @@ subtitle = ""
# Filter posts by a taxonomy term.
[content.filters]
tag = ""
- category = ""
+ category = "News"
publication_type = ""
exclude_featured = false
-
+
[design]
# Toggle between the various page layout types.
# 1 = List
@@ -37,31 +37,32 @@ subtitle = ""
# 3 = Card
# 4 = Citation (publication only)
view = 2
-
+
[design.background]
# Apply a background color, gradient, or image.
# Uncomment (by removing `#`) an option to apply it.
# Choose a light or dark text color by setting `text_color_light`.
# Any HTML color name or Hex value is valid.
-
+
# Background color.
- # color = "navy"
-
+ # color = "lightgrey"
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
# Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+
+++
diff --git a/content/home/people.md b/content/home/people.md
index 4088557713f..5f9cf04ca86 100644
--- a/content/home/people.md
+++ b/content/home/people.md
@@ -4,53 +4,56 @@
widget = "people" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = false # Activate this widget? true/false
-weight = 68 # Order that this section will appear.
+active = false # Activate this widget? true/false
+weight = 50 # Order that this section will appear.
-title = "Meet the Team"
-subtitle = ""
+title = "People"
+subtitle = "Meet the people working at the Systems Research Lab at UC Santa Cruz. Our team works with other research groups at UC Santa Cruz, as well as other universities, national labs, and industry, within and outside the USA."
[content]
# Choose which groups/teams of users to display.
# Edit `user_groups` in each user's profile to add them to one or more of these groups.
- user_groups = ["Principal Investigators",
- "Researchers",
- "Grad Students",
+ user_groups = ["Principal Investigator",
"Administration",
+ "Faculty",
+ "Researchers",
+ "Current PhD Students",
+ "Current MS Students",
"Visitors",
- "Alumni"]
+ "Graduated PhD Students"]
[design]
+ columns = "2"
# Show user's social networking links? (true/false)
- show_social = false
+ show_social = true
# Show user's interests? (true/false)
- show_interests = true
+ show_interests = false
[design.background]
# Apply a background color, gradient, or image.
# Uncomment (by removing `#`) an option to apply it.
# Choose a light or dark text color by setting `text_color_light`.
# Any HTML color name or Hex value is valid.
-
+
# Background color.
# color = "navy"
-
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
- # Custom CSS.
+ # Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+++
diff --git a/content/home/projects.md b/content/home/projects.md
index 660aa9778e5..1cabf2e48d9 100644
--- a/content/home/projects.md
+++ b/content/home/projects.md
@@ -3,32 +3,32 @@
widget = "portfolio" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
active = true # Activate this widget? true/false
-weight = 65 # Order that this section will appear.
+weight = 30 # Order that this section will appear.
-title = "Projects"
-subtitle = ""
+title = "Research"
+subtitle = "Google Scholar: [by year](https://scholar.google.com/citations?hl=en&user=ntU7-hwAAAAJ&view_op=list_works&sortby=pubdate), [by citations](https://scholar.google.com/citations?user=ntU7-hwAAAAJ&hl=en)"
[content]
# Page type to display. E.g. project.
page_type = "project"
-
+
# Filter toolbar (optional).
# Add or remove as many filters (`[[content.filter_button]]` instances) as you like.
# To show all items, set `tag` to "*".
# To filter by a specific tag, set `tag` to an existing tag name.
# To remove toolbar, delete/comment all instances of `[[content.filter_button]]` below.
-
+
# Default filter index (e.g. 0 corresponds to the first `[[filter_button]]` instance below).
filter_default = 0
-
+
# [[content.filter_button]]
# name = "All"
# tag = "*"
-
+
# [[content.filter_button]]
# name = "Deep Learning"
# tag = "Deep Learning"
-
+
# [[content.filter_button]]
# name = "Other"
# tag = "Demo"
@@ -42,7 +42,7 @@ subtitle = ""
# 2 = Compact
# 3 = Card
# 5 = Showcase
- view = 3
+ view = 5
# For Showcase view, flip alternate rows?
flip_alt_rows = false
@@ -52,25 +52,25 @@ subtitle = ""
# Uncomment (by removing `#`) an option to apply it.
# Choose a light or dark text color by setting `text_color_light`.
# Any HTML color name or Hex value is valid.
-
+
# Background color.
# color = "navy"
-
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
- # Custom CSS.
+ # Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+++
diff --git a/content/home/publications.md b/content/home/publications.md
index 1d22f863c82..b42e821eb18 100644
--- a/content/home/publications.md
+++ b/content/home/publications.md
@@ -5,18 +5,18 @@
widget = "pages" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
active = true # Activate this widget? true/false
-weight = 90 # Order that this section will appear.
+weight = 40 # Order that this section will appear.
title = "Recent Publications"
-subtitle = ""
+subtitle = "Google Scholar: [by year](https://scholar.google.com/citations?hl=en&user=ntU7-hwAAAAJ&view_op=list_works&sortby=pubdate), [by citations](https://scholar.google.com/citations?user=ntU7-hwAAAAJ&hl=en)"
[content]
# Page type to display. E.g. post, talk, or publication.
page_type = "publication"
-
+
# Choose how much pages you would like to display (0 = all pages)
count = 5
-
+
# Choose how many pages you would like to offset by
offset = 0
@@ -29,43 +29,43 @@ subtitle = ""
category = ""
publication_type = ""
exclude_featured = false
-
+
[design]
# Toggle between the various page layout types.
# 1 = List
# 2 = Compact
# 3 = Card
# 4 = Citation (publication only)
- view = 2
-
+ view = 4
+
[design.background]
# Apply a background color, gradient, or image.
# Uncomment (by removing `#`) an option to apply it.
# Choose a light or dark text color by setting `text_color_light`.
# Any HTML color name or Hex value is valid.
-
+
# Background color.
# color = "navy"
-
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
- # Custom CSS.
+ # Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+++
-{{% alert note %}}
+{{< callout note >}}
Quickly discover relevant content by [filtering publications]({{< ref "/publication/_index.md" >}}).
-{{% /alert %}}
+{{< /callout >}}
diff --git a/content/home/skills.md b/content/home/skills.md
index 9f9f18ed5ee..3346bb5e538 100644
--- a/content/home/skills.md
+++ b/content/home/skills.md
@@ -2,7 +2,7 @@
# A Skills section created with the Featurette widget.
widget = "featurette" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 30 # Order that this section will appear.
title = "Skills"
diff --git a/content/home/tags.md b/content/home/tags.md
index 6da5875c908..ec451d82a18 100644
--- a/content/home/tags.md
+++ b/content/home/tags.md
@@ -2,11 +2,11 @@
# Tag Cloud widget.
widget = "tag_cloud" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
-weight = 120 # Order that this section will appear.
+active = false # Activate this widget? true/false
+weight = 43 # Order that this section will appear.
title = "Popular Topics"
-subtitle = ""
+subtitle = "In Our Publications"
[content]
# Choose the taxonomy from `config.toml` to display (e.g. tags, categories)
diff --git a/content/home/talks.md b/content/home/talks.md
index 2be025883f4..b4768083d66 100644
--- a/content/home/talks.md
+++ b/content/home/talks.md
@@ -4,7 +4,7 @@
widget = "pages" # See https://sourcethemes.com/academic/docs/page-builder/
headless = true # This file represents a page section.
-active = true # Activate this widget? true/false
+active = false # Activate this widget? true/false
weight = 70 # Order that this section will appear.
title = "Recent & Upcoming Talks"
@@ -12,11 +12,11 @@ subtitle = ""
[content]
# Page type to display. E.g. post, talk, or publication.
- page_type = "talk"
-
+ page_type = "event"
+
# Choose how much pages you would like to display (0 = all pages)
count = 5
-
+
# Choose how many pages you would like to offset by
offset = 0
@@ -31,7 +31,7 @@ subtitle = ""
exclude_featured = false
exclude_past = false
exclude_future = false
-
+
[design]
# Toggle between the various page layout types.
# 1 = List
@@ -39,7 +39,7 @@ subtitle = ""
# 3 = Card
# 4 = Citation (publication only)
view = 2
-
+
[design.background]
# Apply a background color, gradient, or image.
# Uncomment (by removing `#`) an option to apply it.
@@ -48,22 +48,22 @@ subtitle = ""
# Background color.
# color = "navy"
-
+
# Background gradient.
# gradient_start = "DeepSkyBlue"
# gradient_end = "SkyBlue"
-
+
# Background image.
# image = "background.jpg" # Name of image in `static/img/`.
# image_darken = 0.6 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
# Text color (true=light or false=dark).
# text_color_light = true
-
+
[advanced]
- # Custom CSS.
+ # Custom CSS.
css_style = ""
-
+
# CSS class.
css_class = ""
+++
diff --git a/content/home/teaching.md b/content/home/teaching.md
new file mode 100644
index 00000000000..1e47a6fd3d5
--- /dev/null
+++ b/content/home/teaching.md
@@ -0,0 +1,86 @@
++++
+widget = "blank"
+headless = true # This file represents a page section.
+active = true # Activate this widget? true/false
+weight = 45 # Order that this section will appear.
+
+title = "Teaching"
+subtitle = "![Person Lecturing](img/teacher.png)"
+
+[design]
+ # Choose how many columns the section has. Valid values: 1 or 2.
+ columns = "2"
+
+[design.background]
+ # Apply a background color, gradient, or image.
+ # Uncomment (by removing `#`) an option to apply it.
+ # Choose a light or dark text color by setting `text_color_light`.
+ # Any HTML color name or Hex value is valid.
+
+ # Background color.
+ # color = "navy"
+
+ # Background gradient.
+ # gradient_start = "DeepSkyBlue"
+ # gradient_end = "SkyBlue"
+
+ # Background image.
+ # image = "teacher.png" # Name of image in `static/img/`.
+
+ # image_darken = 0 # Darken the image? Range 0-1 where 0 is transparent and 1 is opaque.
+ # image_size = "actual" # Options are `cover` (default), `contain`, or `actual` size.
+ # image_position = "left" # Options include `left`, `center` (default), or `right`.
+ # image_parallax = false # Use a fun parallax-like fixed background effect? true/false
+
+ # Text color (true=light or false=dark).
+ # text_color_light = true
+
+[advanced]
+ # Custom CSS.
+ css_style = ""
+
+ # CSS class.
+ css_class = ""
++++
+
+Spring ’19: [CMPS 229](https://courses.soe.ucsc.edu/courses/cmps229/Spring19/01) (Storage Systems)
+([blog](https://users.soe.ucsc.edu/~carlosm/cmps229.spring19/Blog/Blog.html), [overview](https://users.soe.ucsc.edu/~carlosm/cmps229.spring19/Overview.html), [schedule](https://users.soe.ucsc.edu/~carlosm/cmps229.spring19/Schedule.html), [canvas](https://canvas.ucsc.edu/courses/22042))
+
+Winter ’19: [CMPS 107](https://courses.soe.ucsc.edu/courses/cmps107/Winter19/01) (Open Source Programming)
+([overview](https://sites.google.com/ucsc.edu/cmps107w19/home), [google classroom](https://classroom.google.com/c/MjcyMzA3NjQwODla))
+
+Spring’17: [CMPS 229](https://courses.soe.ucsc.edu/courses/cmps229/Spring17/01) (Storage Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring17/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring17/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring17/Schedule.html), [google classroom](http://classroom.google.com/c/NDk0NzE0NDIzNVpa))
+
+Winter ’17: [CMPS 107](https://courses.soe.ucsc.edu/courses/cmps107/Winter17/01) (Open Source Programming)
+([overview](https://sites.google.com/a/ucsc.edu/open-source-programming/), [google classroom](https://classroom.google.com/c/MzkzNzgwOTc1NVpa))
+
+Winter ’16: [CMPS 107](https://courses.soe.ucsc.edu/courses/cmps107/Winter16/01) (Open Source Programming)
+([overview](https://sites.google.com/a/ucsc.edu/open-source-programming/), [google classroom](http://classroom.google.com/c/NzU0MTI0NjUx))
+
+Fall ’15: CMPS 232 (Distributed Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.fall15/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.fall15/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.fall15/Schedule.html))
+
+Spring ’15: CMPS 232 (Distributed Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.spring15/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.spring15/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps232.spring15/Schedule.html))
+
+Spring ’14: [CMPS 229](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring14/Blog/Blog.html) (Storage Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring14/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring14/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring14/Schedule.html), [speakers](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.spring14/Speakers.html))
+
+Winter ’13: [CMPS 290S](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.winter13/Blog/Blog.html) (Big Data Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.winter13/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.winter13/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.winter13/Schedule.html), [speakers](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.winter13/Speakers.html))
+
+Fall ’12: [CMPS 229](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.fall12/Blog/Blog.html) (Storage Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.fall12/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.fall12/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.fall12/Schedule.html), speakers)
+
+Fall ’10: [CMPS 290S](http://www.soe.ucsc.edu/classes/cmps290s/Fall10/) (Coalescing Analysis & Storage)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall10/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall10/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall10/Schedule.html), [speakers](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall10/Guest_Speakers.html))
+
+Fall ’08: [CMPS 290S](http://www.soe.ucsc.edu/classes/cmps290s/Fall08) (Active Storage)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall08/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall08/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall08/Schedule.html), [speakers](http://www.cs.ucsc.edu/%7Ecarlosm/cmps290s.fall08/Guest_Speakers.html))
+
+Winter ’07: [CMPS 229](http://www.soe.ucsc.edu/classes/cmps229/Winter07) (Storage Systems)
+([blog](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.winter07/Blog/Blog.html), [overview](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.winter07/Overview.html), [schedule](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.winter07/Schedule.html), [speakers](http://www.cs.ucsc.edu/%7Ecarlosm/cmps229.winter07/Guest_Speakers.html))
+
+[CMPS 280S](http://www.cs.ucsc.edu/%7Ecarlosm/cmps280s/Overview.html): Systems Research Seminar
+
diff --git a/content/news/.DS_Store b/content/news/.DS_Store
new file mode 100644
index 00000000000..377f12ef222
Binary files /dev/null and b/content/news/.DS_Store differ
diff --git a/content/news/20180103/featured.png b/content/news/20180103/featured.png
new file mode 100644
index 00000000000..911b0f2f1d2
Binary files /dev/null and b/content/news/20180103/featured.png differ
diff --git a/content/news/20180103/index.md b/content/news/20180103/index.md
new file mode 100644
index 00000000000..e1eaebcf686
--- /dev/null
+++ b/content/news/20180103/index.md
@@ -0,0 +1,30 @@
+---
+title: "Huawei joins CROSS"
+summary: "CROSS is very pleased to welcome Huawei to our Industrial Advisory Board."
+authors:
+- carlosm
+tags:
+- CROSS
+categories:
+- News
+date: "2018-01-03"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+CROSS is very pleased to welcome Huawei to our Industrial Advisory Board.
\ No newline at end of file
diff --git a/content/news/20180128/.DS_Store b/content/news/20180128/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180128/.DS_Store differ
diff --git a/content/news/20180128/featured.png b/content/news/20180128/featured.png
new file mode 100644
index 00000000000..2ecee166c86
Binary files /dev/null and b/content/news/20180128/featured.png differ
diff --git a/content/news/20180128/index.md b/content/news/20180128/index.md
new file mode 100644
index 00000000000..f18df7b610a
--- /dev/null
+++ b/content/news/20180128/index.md
@@ -0,0 +1,36 @@
+---
+title: "Papers accepted at IPDPS, ICPE, & CCGrid"
+summary: "In January we got three papers accepted -- one at IPDPS, one at CCGrid, and one at ICPE."
+authors:
+- carlosm
+tags:
+- Publication
+categories:
+- News
+date: "2018-01-28"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+In January we got three papers accepted -- one at IPDPS, one at CCGrid, and one at ICPE:
+
+Ivo Jimenez, Noah Watkins, Michael Sevilla, Jay Lofstead, Carlos Maltzahn, “quiho: Automated Performance Regression Testing Using Inferred Resource Utilization Profiles,” *9th ACM/SPEC International Conference on Performance Engineering (ICPE 2018)*, Berlin, Germany, April 9-13, 2018.
+
+Michael Sevilla, Carlos Maltzahn, Peter Alvaro, Reza Nasirigerdeh, Bradley Settlemyer, Danny Perez, David Rich and Galen Shipman, “Programmable Caches with a Data Management Language & Policy Engine,” *18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2018)*, Washington, DC, May 1-4, 2018.
+
+Michael Sevilla, Ivo Jimenez, Noah Watkins, Jeff LeFevre, Shel Finkelstein, Peter Alvaro, Patrick Donnelly and Carlos Maltzahn, “Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace,” *32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2018)*, Vancouver, BC, Canada, May 21-25, 2018.
\ No newline at end of file
diff --git a/content/news/20180129/.DS_Store b/content/news/20180129/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180129/.DS_Store differ
diff --git a/content/news/20180129/featured.png b/content/news/20180129/featured.png
new file mode 100644
index 00000000000..3379c381179
Binary files /dev/null and b/content/news/20180129/featured.png differ
diff --git a/content/news/20180129/index.md b/content/news/20180129/index.md
new file mode 100644
index 00000000000..3e0124d6cd7
--- /dev/null
+++ b/content/news/20180129/index.md
@@ -0,0 +1,32 @@
+---
+title: "Congratulations, Dr. Ionkov!"
+summary: 'Please join Scott Brandt, Maya Gokhale, Katia Obraczka, and me in congratulating Dr. Latchesar Ionkov on his successful Ph.D. defense today on "Optimizing Access to Scientific Data for Storage, Analysis and Visualization".'
+authors:
+- carlosm
+tags:
+- Reproducibility
+categories:
+- News
+date: "2018-01-29"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+Please join Scott Brandt, Maya Gokhale, Katia Obraczka, and me in congratulating Dr. Latchesar Ionkov on his successful Ph.D. defense today on "Optimizing Access to Scientific Data for Storage, Analysis and Visualization".
+
+Well done, Lucho!
\ No newline at end of file
diff --git a/content/news/20180219/.DS_Store b/content/news/20180219/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180219/.DS_Store differ
diff --git a/content/news/20180219/featured.png b/content/news/20180219/featured.png
new file mode 100644
index 00000000000..783d1d1febf
Binary files /dev/null and b/content/news/20180219/featured.png differ
diff --git a/content/news/20180219/index.md b/content/news/20180219/index.md
new file mode 100644
index 00000000000..1e9b3a0620b
--- /dev/null
+++ b/content/news/20180219/index.md
@@ -0,0 +1,30 @@
+---
+title: "Ivo Jimenez wins BSSw Fellowship"
+summary: "The DOE Exascale Computing Project recognized four Better Scientific Software Fellows this year. Ivo is the only graduate student among them."
+authors:
+- carlosm
+tags:
+- Reproducibility
+categories:
+- News
+date: "2018-02-19"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+The DOE [Exascale Computing Project](https://www.exascaleproject.org/) recognized [four Better Scientific Software Fellows](https://bssw.io/blog_posts/introducing-the-2018-bssw-fellows) this year. Ivo is the only graduate student among them -- for his work on [Popper](http://falsifiable.us/), as part of the NSF-funded [Big Weather Web project](http://bigweatherweb.org/Big_Weather_Web/Home/Home.html) and the Center for Research in Open Source Software ([cross.ucsc.edu](https://cross.ucsc.edu)). More details at [UC Santa Cruz news](https://news.ucsc.edu/2018/02/bssw-fellow.html).
\ No newline at end of file
diff --git a/content/news/20180424/.DS_Store b/content/news/20180424/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180424/.DS_Store differ
diff --git a/content/news/20180424/featured.png b/content/news/20180424/featured.png
new file mode 100644
index 00000000000..87b1c6e7782
Binary files /dev/null and b/content/news/20180424/featured.png differ
diff --git a/content/news/20180424/index.md b/content/news/20180424/index.md
new file mode 100644
index 00000000000..66e66326dd1
--- /dev/null
+++ b/content/news/20180424/index.md
@@ -0,0 +1,42 @@
+---
+title: "CROSS Welcomes 2 GSoC Students"
+summary: "CROSS is honored to be a mentor organization for two of the 2018 Google Summer of Code (GSoC) students. GSoC selected 1,264 students from 64 different countries working with over 200 open source projects."
+authors:
+- carlosm
+tags:
+- CROSS
+categories:
+- News
+date: "2018-04-24"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+CROSS is honored to be a mentor organization for two 2018 Google Summer of Code (GSoC) students. GSoC selected 1,264 students from 64 countries to work with over 200 open source projects.
+
+The two CROSS projects are:
+
+**Title**: Archiving with Popper CLI
+**GSoC Student**: Ankan Poddar (National Institute of Technology, Durgapur, West Bengal, India)
+**Mentors**: Ivo Jimenez, Michael Sevilla
+**Abstract**: The project implements a popper sub-command, archive, that creates an online archive (snapshot) of the repository at any point in time and optionally generates a Digital Object Identifier (DOI). Zenodo, Figshare, Open Science Framework (OSF), and Dataverse all provide public APIs that can be used to create online archives on these services and obtain a DOI.
+
+**Title**: Zlog Entry Caching & Benchmarking
+**GSoC Student**: Javier Ron (Escuela Superior Politécnica del Litoral - ESPOL, Guayaquil, Ecuador)
+**Mentors**: Noah Watkins, Jeff LeFevre
+**Abstract**: Create a flexible caching service that can function locally on a ZLog client node while forming the basis for cache management in a proxy caching server; find and build a benchmark tool for ZLog; and run and report benchmark tests using it.
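+
+As a rough sketch of what such an archive sub-command has to do, the snippet below walks Zenodo's public REST deposit API: create a deposition, upload the snapshot, attach metadata, and publish, which is the step that mints the DOI. (This is an illustration, not the actual Popper implementation; the file name, metadata, and token are placeholders.)
+
+```python
+import requests
+
+ZENODO = "https://zenodo.org/api"
+TOKEN = "..."  # personal access token with deposit scope
+
+# 1. Create an empty deposition.
+dep = requests.post(f"{ZENODO}/deposit/depositions",
+                    params={"access_token": TOKEN}, json={}).json()
+
+# 2. Upload the repository snapshot (placeholder file name).
+with open("repo-snapshot.zip", "rb") as fp:
+    requests.post(f"{ZENODO}/deposit/depositions/{dep['id']}/files",
+                  params={"access_token": TOKEN},
+                  data={"name": "repo-snapshot.zip"}, files={"file": fp})
+
+# 3. Attach minimal metadata, then publish -- publishing mints the DOI.
+meta = {"metadata": {"title": "my-experiment snapshot",
+                     "upload_type": "software",
+                     "description": "Snapshot created by popper archive.",
+                     "creators": [{"name": "Doe, Jane"}]}}
+requests.put(f"{ZENODO}/deposit/depositions/{dep['id']}",
+             params={"access_token": TOKEN}, json=meta)
+pub = requests.post(f"{ZENODO}/deposit/depositions/{dep['id']}/actions/publish",
+                    params={"access_token": TOKEN}).json()
+print(pub["doi"])
+```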
\ No newline at end of file
diff --git a/content/news/20180425/.DS_Store b/content/news/20180425/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180425/.DS_Store differ
diff --git a/content/news/20180425/featured.png b/content/news/20180425/featured.png
new file mode 100644
index 00000000000..c38d2c2a62e
Binary files /dev/null and b/content/news/20180425/featured.png differ
diff --git a/content/news/20180425/index.md b/content/news/20180425/index.md
new file mode 100644
index 00000000000..6c70fbcfcf5
--- /dev/null
+++ b/content/news/20180425/index.md
@@ -0,0 +1,32 @@
+---
+title: "Congratulations, Dr. Sevilla!"
+summary: 'Please join Peter Alvaro, Scott Brandt, Ike Nassi, and me in congratulating Dr. Michael Sevilla on his successful Ph.D. defense on "Scalable, Global Namespaces with Programmable Storage".'
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2018-04-25"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+Please join Peter Alvaro, Scott Brandt, Ike Nassi, and me in congratulating Dr. Michael Sevilla on his successful Ph.D. defense today on "Scalable, Global Namespaces with Programmable Storage".
+
+Well done, Michael!
\ No newline at end of file
diff --git a/content/news/20180606/.DS_Store b/content/news/20180606/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180606/.DS_Store differ
diff --git a/content/news/20180606/featured.png b/content/news/20180606/featured.png
new file mode 100644
index 00000000000..24d2799f4c1
Binary files /dev/null and b/content/news/20180606/featured.png differ
diff --git a/content/news/20180606/index.md b/content/news/20180606/index.md
new file mode 100644
index 00000000000..bc6d666927b
--- /dev/null
+++ b/content/news/20180606/index.md
@@ -0,0 +1,30 @@
+---
+title: "Eusocial Storage Devices in USENIX ;login:"
+summary: "By Philip Kufeldt, Carlos Maltzahn, Tim Feldman, Christine Green, Grant Mackey, and Shingo Tanaka."
+authors:
+- carlosm
+tags:
+- Publication
+categories:
+- News
+date: "2018-06-06"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+Philip Kufeldt, Carlos Maltzahn, Tim Feldman, Christine Green, Grant Mackey, and Shingo Tanaka, “[Eusocial storage devices - offloading data management to storage devices that can act collectively](https://drive.google.com/file/d/1WzEZr0Xzn0c7ke3Mpt8mqa1J_tEU58-D/view?usp=sharing),” *;login: The USENIX Magazine* **43** (2018), no. 2, 16–22.
\ No newline at end of file
diff --git a/content/news/20180619/.DS_Store b/content/news/20180619/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180619/.DS_Store differ
diff --git a/content/news/20180619/featured.png b/content/news/20180619/featured.png
new file mode 100644
index 00000000000..6556fbef99c
Binary files /dev/null and b/content/news/20180619/featured.png differ
diff --git a/content/news/20180619/index.md b/content/news/20180619/index.md
new file mode 100644
index 00000000000..84412e83ea4
--- /dev/null
+++ b/content/news/20180619/index.md
@@ -0,0 +1,34 @@
+---
+title: "Congratulations, Dr. Watkins!"
+summary: 'Please join Peter Alvaro, Scott Brandt, and me in congratulating Dr. Noah Watkins on his successful Ph.D. defense today on "Programmable Storage".'
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2018-06-19"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- programmable-storage
+- declstore
+---
+Please join Peter Alvaro, Scott Brandt, and me in congratulating Dr. Noah Watkins on his successful Ph.D. defense today on "Programmable Storage".
+
+Well done, Noah!
diff --git a/content/news/20180720/.DS_Store b/content/news/20180720/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180720/.DS_Store differ
diff --git a/content/news/20180720/featured.png b/content/news/20180720/featured.png
new file mode 100644
index 00000000000..6f485e12f50
Binary files /dev/null and b/content/news/20180720/featured.png differ
diff --git a/content/news/20180720/index.md b/content/news/20180720/index.md
new file mode 100644
index 00000000000..20260e92406
--- /dev/null
+++ b/content/news/20180720/index.md
@@ -0,0 +1,37 @@
+---
+title: "Paper accepted at OSDI"
+summary: "Congratulations to everyone: two of Ryan Stutsman and Robert Ricci’s (University of Utah) OSDI submissions got accepted, one of which we co-authored."
+authors:
+- carlosm
+tags:
+- Publication
+categories:
+- News
+date: "2018-07-20"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+---
+Congratulations to everyone: two of the OSDI submissions by Ryan Stutsman and Robert Ricci (University of Utah) were accepted, one of which we co-authored:
+
+Aleksander Maricq, Dmitry Duplyakin, Ivo Jimenez, Carlos Maltzahn, Ryan Stutsman, and Robert Ricci, “Taming performance variability,” *OSDI’18* (Carlsbad, CA), October 8-10 2018.
+
+**Abstract:** The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution. This variation has important effects on the reproducibility of systems research and ability to quantitatively compare the performance of different systems. It also has implications for commercial computing, where agreements are often made conditioned on meeting specific performance targets.
+
+Over a period of 10 months, we conducted a large-scale study capturing performance samples from 835 servers comprising nearly 900,000 data points. We examine this data from two perspectives: that of a service provider wishing to offer a consistent environment, and that of a systems researcher who must understand how variability impacts experimental results. From this examination, we draw a number of lessons about the types and magnitudes of performance variability and the effects on confidence in experiment results. We also create a statistical model that can be used to understand how representative an individual server is of the general population. The full dataset and our analysis tools are publicly available, and we have built a system to interactively explore the data and make recommendations for experiment parameters based on rigorous statistics.
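+
+For a flavor of the statistics involved, here is a minimal sketch (my illustration, not the paper's released tooling) that bootstraps a nonparametric confidence interval around the median of repeated benchmark runs -- the kind of interval a result has to clear before a cross-system comparison means much:
+
+```python
+import numpy as np
+
+def median_ci(samples, confidence=0.95, n_boot=10_000, seed=0):
+    """Bootstrap a confidence interval for the median of repeated runs."""
+    rng = np.random.default_rng(seed)
+    resamples = rng.choice(samples, size=(n_boot, len(samples)), replace=True)
+    medians = np.median(resamples, axis=1)
+    alpha = 100.0 * (1.0 - confidence) / 2.0
+    lo, hi = np.percentile(medians, [alpha, 100.0 - alpha])
+    return lo, hi
+
+# Seven runs of one benchmark on nominally identical hardware (made-up numbers).
+runs = np.array([102.3, 99.8, 101.1, 110.4, 100.2, 98.7, 103.9])
+print(median_ci(runs))
+```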
diff --git a/content/news/20180823/.DS_Store b/content/news/20180823/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180823/.DS_Store differ
diff --git a/content/news/20180823/featured.png b/content/news/20180823/featured.png
new file mode 100644
index 00000000000..3f340c75188
Binary files /dev/null and b/content/news/20180823/featured.png differ
diff --git a/content/news/20180823/index.md b/content/news/20180823/index.md
new file mode 100644
index 00000000000..5a89c0d090f
--- /dev/null
+++ b/content/news/20180823/index.md
@@ -0,0 +1,33 @@
+---
+title: "New NSF Award: Declarative Storage"
+summary: "Peter Alvaro and I received a new CSR/Medium NSF Award to explore Declarative Programmable Storage. The award is a total of $850,000 over three years."
+authors:
+- carlosm
+tags:
+- Funding
+categories:
+- News
+date: "2018-08-23"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- declstore
+---
+Peter Alvaro and I received a new CSR/Medium NSF Award to explore Declarative Programmable Storage. The award is a total of $850,000 over three years. More details:
+
+Noah Watkins, Michael Sevilla, Ivo Jimenez, Kathryn Dahlgren, Peter Alvaro, Shel Finkelstein, Carlos Maltzahn, “[DeclStore: Layering is for the Faint of Heart](https://www.usenix.org/system/files/conference/hotstorage17/hotstorage17-paper-watkins.pdf),” 9th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage’17) co-located with USENIX ATC’17, Santa Clara, CA, July 10-11, 2017
diff --git a/content/news/20180905/.DS_Store b/content/news/20180905/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180905/.DS_Store differ
diff --git a/content/news/20180905/featured.png b/content/news/20180905/featured.png
new file mode 100644
index 00000000000..1aa92da4ef7
Binary files /dev/null and b/content/news/20180905/featured.png differ
diff --git a/content/news/20180905/index.md b/content/news/20180905/index.md
new file mode 100644
index 00000000000..4df16fd2727
--- /dev/null
+++ b/content/news/20180905/index.md
@@ -0,0 +1,33 @@
+---
+title: "NSF Award to help establish IRIS-HEP"
+summary: "I very much look forward to have us work with Princeton University on an NSF-funded 5-year project to establish the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP)."
+authors:
+- carlosm
+tags:
+- Event
+categories:
+- News
+date: "2018-09-06"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- programmable-storage
+- declstore
+- practical-reproducibility
+---
+I very much look forward to having us work with Princeton University on an NSF-funded 5-year project to establish the Institute for Research and Innovation in Software for High Energy Physics ([IRIS-HEP](http://iris-hep.org/)). This effort is a perfect complement to the existing NSF grants for Programmable Storage (with Dirk Grunwald at CU Boulder) and Declarative Programmable Storage (with Peter Alvaro). For more details, see the [news release](https://news.ucsc.edu/2018/09/iris-hep-grant.html).
diff --git a/content/news/20180920/.DS_Store b/content/news/20180920/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20180920/.DS_Store differ
diff --git a/content/news/20180920/featured.png b/content/news/20180920/featured.png
new file mode 100644
index 00000000000..8563cd4ce6a
Binary files /dev/null and b/content/news/20180920/featured.png differ
diff --git a/content/news/20180920/index.md b/content/news/20180920/index.md
new file mode 100644
index 00000000000..3538342869c
--- /dev/null
+++ b/content/news/20180920/index.md
@@ -0,0 +1,42 @@
+---
+title: "Register for upcoming CROSS Symposium!"
+summary: "On October 3-4, 2018, the Center for Research in Open Source Software (CROSS) will host its 3rd Annual Research Symposium at the Baskin School of Engineering, University of California, Santa Cruz."
+authors:
+- carlosm
+tags:
+- Event
+categories:
+- News
+date: "2018-09-20"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+On October 3-4, 2018, [the Center for Research in Open Source Software (CROSS)](https://cross.ucsc.edu/) will host its [3rd Annual Research Symposium](https://cross.ucsc.edu/2018-symposium/) at the Baskin School of Engineering, University of California, Santa Cruz.
+
+Check out our [preliminary agenda](https://cross.ucsc.edu/2018-symposium/), including keynotes by [Cat Allman](https://www.linkedin.com/in/catallman/) (Google), [AnHai Doan](http://pages.cs.wisc.edu/~anhai/) (Univ. Wisconsin-Madison), and [Jay Kreps](https://www.linkedin.com/in/jaykreps/) (Confluent).
+
+[Registration is now open](https://cross-registration.soe.ucsc.edu/registration/4) until September 25. The registration fee is $299.
+
+This two-day Symposium will provide an opportunity to learn about the research and incubator projects at CROSS, interact with CROSS faculty, graduate students, and affiliated researchers, and discuss future directions and collaborative research projects at UC Santa Cruz. The program includes plenary sessions with keynotes and lightning talks, a two-track workshop program, and our traditional, can’t-miss Oktoberfest dinner on the evening of the first day.
+
+We look forward to seeing you at the Symposium!
+
+[Carlos Maltzahn](https://users.soe.ucsc.edu/~carlosm/UCSC/Home/Home.html)
+
+P.S.: If you are unfamiliar with CROSS, check out [our flyer](https://users.soe.ucsc.edu/~carlosm/2018CrossOnePage.html), visit [cross.ucsc.edu](https://cross.ucsc.edu/), or [contact us](mailto:cross-info@ucsc.edu).
\ No newline at end of file
diff --git a/content/news/20181206/.DS_Store b/content/news/20181206/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20181206/.DS_Store differ
diff --git a/content/news/20181206/featured.png b/content/news/20181206/featured.png
new file mode 100644
index 00000000000..167b0866d00
Binary files /dev/null and b/content/news/20181206/featured.png differ
diff --git a/content/news/20181206/index.md b/content/news/20181206/index.md
new file mode 100644
index 00000000000..e642ab46891
--- /dev/null
+++ b/content/news/20181206/index.md
@@ -0,0 +1,32 @@
+---
+title: "CROSS Incubator Project: Skyhook DM"
+summary: "Check out Skyhook Data Management, one of the CROSS incubator projects."
+authors:
+- carlosm
+tags:
+- Project
+categories:
+- News
+date: "2018-12-06"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- declstore
+- programmable-storage
+---
+Check out [Skyhook Data Management](http://skyhookdm.com), one of the CROSS incubator projects. It’s leveraging Ceph’s programmable storage features to make PostgreSQL and other databases more scalable and elastic.
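+
+The mechanism underneath is Ceph's object classes: code that runs inside the OSDs and can be invoked per object, so selection and projection happen next to the data instead of at the client. Below is a minimal sketch with the librados Python binding; the pool, object, class, and method names are hypothetical stand-ins, not Skyhook's actual interface.
+
+```python
+import rados
+
+# Assumes a reachable cluster with a standard ceph.conf and client keyring.
+cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
+cluster.connect()
+ioctx = cluster.open_ioctx("tabular")  # hypothetical pool name
+
+# Invoke an object-class method on the OSD that holds this object; only the
+# rows that survive the predicate travel back over the network.
+ret, rows = ioctx.execute("table.partition.0042", "tabular_cls", "select",
+                          b"age > 30")
+print(ret, len(rows))
+
+ioctx.close()
+cluster.shutdown()
+```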
diff --git a/content/news/20190206/.DS_Store b/content/news/20190206/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190206/.DS_Store differ
diff --git a/content/news/20190206/featured.png b/content/news/20190206/featured.png
new file mode 100644
index 00000000000..61994d37a46
Binary files /dev/null and b/content/news/20190206/featured.png differ
diff --git a/content/news/20190206/index.md b/content/news/20190206/index.md
new file mode 100644
index 00000000000..7fc97f4b021
--- /dev/null
+++ b/content/news/20190206/index.md
@@ -0,0 +1,35 @@
+---
+title: "Congratulations, Dr. Jimenez!"
+summary: 'Please join Peter Alvaro, Scott Brandt, Jay Lofstead, and me in congratulating Dr. Ivo Jimenez on his successful Ph.D. defense today on "Agile Research Delivery".'
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2019-02-06"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+---
+Please join Peter Alvaro, Scott Brandt, Jay Lofstead, and me in congratulating Dr. Ivo Jimenez on his successful Ph.D. defense today on "Agile Research Delivery".
+
+Well done, Ivo!
+
+(The photo above is from the notes Ivo’s 7-year-old daughter took during his defense.)
diff --git a/content/news/20190214/.DS_Store b/content/news/20190214/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190214/.DS_Store differ
diff --git a/content/news/20190214/featured.png b/content/news/20190214/featured.png
new file mode 100644
index 00000000000..c89f87ced62
Binary files /dev/null and b/content/news/20190214/featured.png differ
diff --git a/content/news/20190214/index.md b/content/news/20190214/index.md
new file mode 100644
index 00000000000..5d99d3ca530
--- /dev/null
+++ b/content/news/20190214/index.md
@@ -0,0 +1,35 @@
+---
+title: "Philip Kufeldt on Composable Infrastructure"
+summary: "CROSS Industry-Practitioner-in-Residence Philip Kufeldt contributed to the SNIA Live Webcast on [Why Composable Infrastructure](https://www.brighttalk.com/webcast/663/344762/why-composable-infrastructure)"
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2019-02-14"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- eusocial-storage
+---
+CROSS Industry-Practitioner-in-Residence Philip Kufeldt contributed to the SNIA Live Webcast on “[Why Composable Infrastructure](https://www.brighttalk.com/webcast/663/344762/why-composable-infrastructure)”, together with Mike Jochimsen (Kaminario) and hosted by Alex McDonald (NetApp).
+
+The key takeaway for me is that the increasing heterogeneity of components due to the end of Moore’s Law will make configuring system hardware harder, especially for an increasingly varied set of workloads -- unless these systems are dynamically reconfigurable, or “composable”, at the physical level. This will drive disaggregation, which in turn will require components that attach to a full cross-bar network. In our CROSS research project on “[Eusocial Storage Devices](https://users.soe.ucsc.edu/~carlosm/UCSC/Research/Entries/2018/7/20_Eusocial_Storage_Devices.html)” we are looking not only at the design of such components but also at the fact that they can communicate with each other and can therefore act collectively in a coordinated fashion.
+
+Don’t miss the excellent [Q&A for this session](http://www.sniacloud.com/composable-infrastructure-qa/).
diff --git a/content/news/20190316/.DS_Store b/content/news/20190316/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190316/.DS_Store differ
diff --git a/content/news/20190316/featured.png b/content/news/20190316/featured.png
new file mode 100644
index 00000000000..8eeb8a20b66
Binary files /dev/null and b/content/news/20190316/featured.png differ
diff --git a/content/news/20190316/index.md b/content/news/20190316/index.md
new file mode 100644
index 00000000000..323a39966e5
--- /dev/null
+++ b/content/news/20190316/index.md
@@ -0,0 +1,34 @@
+---
+title: "Presentation at OSLS 2019"
+summary: "The Linux Foundation invited me to the Open Source Leadership Summit that this year, very conveniently, took place in Half Moon Bay."
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2019-03-21"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+The Linux Foundation invited me to the [Open Source Leadership Summit](https://events.linuxfoundation.org/events/open-source-leadership-summit-2019/) that this year, very conveniently, took place in Half Moon Bay.
+
+I was selected to present on how to leverage research universities ([pdf](https://events.linuxfoundation.org/wp-content/uploads/2018/07/How-to-Leverage-Research-Universities.pdf), [prezi](https://prezi.com/view/eRW7M8E5pkLZRGNur0AN)). Abstract: “Once Ph.D. students graduate, they tend to throw away what are often pretty amazing software infrastructures they built as part of their research project. One of the exceptions is Ceph because Sage Weil was able to build a community around what was 12 years ago just a research prototype. To enable more students to have a similar career as Sage, I founded the Center for Research in Open Source Software (CROSS) to offer students a career path to OSS leadership. Now sustained by six industry members, CROSS is funding three incubator and three research projects, and teaches an undergraduate course on how to submit Linux kernel patches.”
+
+The event was amazing: it was great to see friends again, some of whom had a great influence on the design of CROSS, and to meet people with exciting new perspectives. All [the keynotes](https://www.youtube.com/playlist?list=PLbzoR-pLrL6qAgIuy5ft7CNWD7UQ4XdIS) are online, and [the slides](https://events.linuxfoundation.org/events/open-source-leadership-summit-2019/program/slides/) of most of the presentations are as well. It all took place at the beautiful Ritz-Carlton Half Moon Bay; the food and dessert had plenty of gluten-free options, and, something I had never seen before, there was soy milk and oat milk next to the coffee!
\ No newline at end of file
diff --git a/content/news/20190321/.DS_Store b/content/news/20190321/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190321/.DS_Store differ
diff --git a/content/news/20190321/featured.png b/content/news/20190321/featured.png
new file mode 100644
index 00000000000..45518d567fa
Binary files /dev/null and b/content/news/20190321/featured.png differ
diff --git a/content/news/20190321/index.md b/content/news/20190321/index.md
new file mode 100644
index 00000000000..1f494d23cd5
--- /dev/null
+++ b/content/news/20190321/index.md
@@ -0,0 +1,33 @@
+---
+title: "Data Storage Research Vision 2025"
+summary: "I was very honored to be part of last year’s Data Storage Research Vision 2025 NSF Workshop"
+authors:
+- carlosm
+tags:
+- Publication
+categories:
+- News
+date: "2019-03-21"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- eusocial-storage
+- programmable-storage
+- declstore
+---
+I was very honored to be part of last year’s Data Storage Research Vision 2025 NSF Workshop convened by George Amvrosiadis, Ali Butt, Vasily Tarasov, Erez Zadok, and Ming Zhao at IBM Almaden. The report is now available [here](https://dl.acm.org/citation.cfm?id=3316807).
diff --git a/content/news/20190412/.DS_Store b/content/news/20190412/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190412/.DS_Store differ
diff --git a/content/news/20190412/featured.png b/content/news/20190412/featured.png
new file mode 100644
index 00000000000..c003abac60e
Binary files /dev/null and b/content/news/20190412/featured.png differ
diff --git a/content/news/20190412/index.md b/content/news/20190412/index.md
new file mode 100644
index 00000000000..9d594e1e900
--- /dev/null
+++ b/content/news/20190412/index.md
@@ -0,0 +1,36 @@
+---
+title: "Guest on Embedded.fm"
+summary: 'One of the outcomes of attending [!!Con West](http://bangbangcon.com/west/2019/) was an opportunity to join the 285th podcast of [Embedded.fm](https://www.embedded.fm/): “[A Chicken Getting to the Other Side](https://www.embedded.fm/episodes/285)”'
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2019-04-12"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+- programmable-storage
+---
+One of the outcomes of attending [!!Con West](http://bangbangcon.com/west/2019/) was an opportunity to join the 285th podcast of [Embedded.fm](https://www.embedded.fm/): “[A Chicken Getting to the Other Side](https://www.embedded.fm/episodes/285)” to talk about [CROSS](https://cross.ucsc.edu) and its three incubator fellows, [Kate Compton](https://www.galaxykate.com/), [Ivo Jimenez](https://users.soe.ucsc.edu/~ivo/), and [Jeff LeFevre](https://users.soe.ucsc.edu/~jlefevre/). If you are wondering about the choice of title, listen to the podcast!
+
+The podcast is created and hosted by Elecia White of “[Making Embedded Systems](http://amzn.to/1XxPvjR)” fame and Chris White who, in addition to being an embedded systems engineer, did all the editing of this podcast -- which made me sound so much better than in real life!
+
+I’ve never been on a podcast before. Elecia and Chris did a fantastic job of preparing me for the episode and for making this a great experience throughout.
diff --git a/content/news/20190610/.DS_Store b/content/news/20190610/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20190610/.DS_Store differ
diff --git a/content/news/20190610/featured.png b/content/news/20190610/featured.png
new file mode 100644
index 00000000000..88323a05545
Binary files /dev/null and b/content/news/20190610/featured.png differ
diff --git a/content/news/20190610/index.md b/content/news/20190610/index.md
new file mode 100644
index 00000000000..6f5aa4520d5
--- /dev/null
+++ b/content/news/20190610/index.md
@@ -0,0 +1,34 @@
+---
+title: "Call for Participation for SSDBM 2019"
+summary: "The 31st International Conference on Scientific and Statistical Database Management ([SSDBM 2019](https://uccross.github.io/ssdbm2019/)) will be held in Santa Cruz, CA, USA on July 23-25, 2019."
+authors:
+- carlosm
+tags:
+- Event
+categories:
+- News
+date: "2019-06-10"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+The 31st International Conference on Scientific and Statistical Database Management ([SSDBM 2019](https://uccross.github.io/ssdbm2019/)) will be held in Santa Cruz, CA, USA on July 23-25, 2019. The event will take place on the beautiful [UC Santa Cruz campus](https://uccross.github.io/ssdbm2019/#conference-venue) among redwoods and ocean views. The [conference organizers](https://uccross.github.io/ssdbm2019/#conference-organizers) put together a [great program](https://uccross.github.io/ssdbm2019/#detailed-program) with [keynotes](https://uccross.github.io/ssdbm2019/#keynotes) by Magdalena Balazinska, Susan Davidson, and Alok Choudhary, as well as [evening events](https://uccross.github.io/ssdbm2019/#social-events) at the historic Cowell Hay Barn and the Seymour Marine Discovery Center.
+
+[Conference registration](https://uccross.github.io/ssdbm2019/#registration) is open and the regular registration price is available until July 6. However, most [hotel reservation](https://uccross.github.io/ssdbm2019/#accommodation) blocks expire June 21.
+
+Here is a 1-page [flyer](https://drive.google.com/file/d/1eGMLpngK_0g1ah47yExfyllhXu9MbzYP/view?usp=sharing) -- feel free to forward, print, or post.
\ No newline at end of file
diff --git a/content/news/20191002/.DS_Store b/content/news/20191002/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20191002/.DS_Store differ
diff --git a/content/news/20191002/featured.png b/content/news/20191002/featured.png
new file mode 100644
index 00000000000..78bfed40bfe
Binary files /dev/null and b/content/news/20191002/featured.png differ
diff --git a/content/news/20191002/index.md b/content/news/20191002/index.md
new file mode 100644
index 00000000000..7aebc72adee
--- /dev/null
+++ b/content/news/20191002/index.md
@@ -0,0 +1,34 @@
+---
+title: "2019 CROSS Research Symposium"
+summary: "The [4th Annual CROSS Research Symposium](https://cross.ucsc.edu/2019-symposium) is taking place today and tomorrow with three keynote speakers, Daniela Barbosa, Joseph Jacks, and Haoyuan Li."
+authors:
+- carlosm
+tags:
+- Event
+categories:
+- News
+date: "2019-10-02"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- eusocial-storage
+- declstore
+- programmable-storage
+- practical-reproducibility
+---
+The [4th Annual CROSS Research Symposium](https://cross.ucsc.edu/2019-symposium) is taking place today and tomorrow with three keynote speakers: Daniela Barbosa (VP World-wide Alliances, Linux Foundation Hyperledger), Joseph Jacks (Founder and CEO, OSS Capital), and Haoyuan Li (Founder, CTO and Chairman, Alluxio). The program also features 12 sessions on topics including Reproducibility in Systems Research, Open Science & Open Access, Aspects of Data Management, Programming AI for poets, Storage Systems, Eusocial Storage Devices, Hardware Security, Open Source Hardware Flows, and Data Management within the Storage System, as well as the 11th UC Santa Cruz Systems Oktoberfest.
diff --git a/content/news/20191019/.DS_Store b/content/news/20191019/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20191019/.DS_Store differ
diff --git a/content/news/20191019/featured.png b/content/news/20191019/featured.png
new file mode 100644
index 00000000000..51962d2b72c
Binary files /dev/null and b/content/news/20191019/featured.png differ
diff --git a/content/news/20191019/index.md b/content/news/20191019/index.md
new file mode 100644
index 00000000000..655f691a4b0
--- /dev/null
+++ b/content/news/20191019/index.md
@@ -0,0 +1,34 @@
+---
+title: "15th GSoC Mentor Summit in Munich"
+summary: "In the second year of CROSS being a mentor organization for Google Summer of Code, we received Google funding for 4 summer students. For more information, see my GSoC Mentor Summit [lightening talk](https://docs.google.com/presentation/d/17a8PSOWnuoOOfv-snVEno2oIaDFEUfQ6UirhHW4k6k0/edit?usp=sharing)."
+authors:
+- carlosm
+tags:
+- Presentation
+categories:
+- News
+date: "2019-10-19"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+In the second year of CROSS serving as a mentor organization for Google Summer of Code, we received Google funding for four summer students: Jayjeet Chakraborty (NIT Durgapur, West Bengal, India), Bárbara Galindo Dórame (Universidad de Sonora, Hermosillo, Mexico), Mohd Arshul Mansoori (Ambedkar Institute for Advanced Communication Technology & Research, Delhi, India), and Ashay Shirwadkar (UC Riverside).
+
+The students were mentored by CROSS incubator fellows Ivo Jimenez and Jeff LeFevre as well as Michael Sevilla (TidalScale), Noah Watkins (vectorized.io), and Quincy Wofford (LANL). They did great work, and we were able to invite two of the students to this year's CROSS Symposium and to sponsor another one to visit SC19.
+
+For more information, see my GSoC Mentor Summit [lightning talk](https://docs.google.com/presentation/d/17a8PSOWnuoOOfv-snVEno2oIaDFEUfQ6UirhHW4k6k0/edit?usp=sharing).
\ No newline at end of file
diff --git a/content/news/20191221/.DS_Store b/content/news/20191221/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/news/20191221/.DS_Store differ
diff --git a/content/news/20191221/featured.png b/content/news/20191221/featured.png
new file mode 100644
index 00000000000..dd960dfcc74
Binary files /dev/null and b/content/news/20191221/featured.png differ
diff --git a/content/news/20191221/index.md b/content/news/20191221/index.md
new file mode 100644
index 00000000000..960c5b71752
--- /dev/null
+++ b/content/news/20191221/index.md
@@ -0,0 +1,33 @@
+---
+title: "Paper accepted at NSDI ’20"
+summary: "Our paper ([arxiv](https://arxiv.org/abs/1912.09256)) led by Alexandru Uta at Vrije Universiteit Amsterdam was accepted at [NSDI ’20](https://www.usenix.org/conference/nsdi20). The final version of the paper is going to be available on 2/7/20."
+authors:
+- carlosm
+tags:
+- Publication
+categories:
+- News
+date: "2019-12-21"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ caption: "Image credit: [USENIX NSDI '20](https://www.usenix.org/conference/nsdi20)"
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+Our paper ([arxiv](https://arxiv.org/abs/1912.09256)) led by Alexandru Uta at Vrije Universiteit Amsterdam was accepted at [NSDI ’20](https://www.usenix.org/conference/nsdi20). The final version of the paper is going to be available on 2/7/20.
+
+**Abstract:** Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability when running experiments in the cloud. Focusing on networks, we assess the impact of variability on cloud-based big-data workloads by gathering traces from mainstream commercial clouds and private research clouds. Our data collection consists of millions of datapoints gathered while transferring over 9 petabytes of data. We characterize the network variability present in our data and show that, even though commercial cloud providers implement mechanisms for quality-of-service enforcement, variability still occurs, and is even exacerbated by such mechanisms and service provider policies. We show how big-data workloads suffer from significant slowdowns and lack predictability and replicability, even when state-of-the-art experimentation techniques are used. We provide guidelines for practitioners to reduce the volatility of big data performance, making experiments more repeatable.
\ No newline at end of file
diff --git a/content/news/20200119/featured.png b/content/news/20200119/featured.png
new file mode 100644
index 00000000000..c96b1689f72
Binary files /dev/null and b/content/news/20200119/featured.png differ
diff --git a/content/news/20200119/index.md b/content/news/20200119/index.md
new file mode 100644
index 00000000000..318ff0dfc94
--- /dev/null
+++ b/content/news/20200119/index.md
@@ -0,0 +1,39 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "New web site!"
+subtitle: "Saying Good Bye to iWeb"
+summary: "I'm retiring my old website because I won't be able to use iWeb any longer and there are now really nice, open source frameworks available."
+authors:
+- carlosm
+tags:
+- Website
+categories:
+- News
+date: 2020-01-19T14:25:02-08:00
+lastmod: 2020-01-19T14:25:02-08:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+I'm retiring my [old website](https://users.soe.ucsc.edu/~carlosm/UCSC/Home/Home.html) because I won't be able to use iWeb any longer and there are now really nice, open source frameworks available.
+
+Why the switch? I’ve been using [iWeb](https://en.wikipedia.org/wiki/IWeb) to maintain this website. Even though Apple discontinued iWeb in 2011, I was able to use it productively, and I always had a million more important things to do than porting my website. Now, Apple’s macOS Catalina doesn’t run 32-bit applications, including iWeb. I considered using a virtual machine to run Mojave -- but the primary reason I upgrade macOS is increased security. Time to switch.
+
+Luckily, just in time for the holiday season I found a really nice framework based on [Hugo](https://en.wikipedia.org/wiki/Hugo_(software)) and the [Academic theme](https://themes.gohugo.io/academic/). I love that it’s a static site generator because of its responsiveness, flexibility, and easy maintenance -- and it’s all open source. No more dependency on a proprietary 32-bit application. Academic comes with a fantastic publications management system that allows very quick in-browser search and adds publications directly from a BibTeX file (via the [academic tool](https://github.com/sourcethemes/academic-admin)).
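+
+For example, importing publications is essentially a one-liner. A minimal sketch, assuming the tool is installed via pip and the BibTeX file is named `publications.bib` (your file name may differ):
+
+```bash
+# Install the academic-admin CLI (requires Python 3)
+pip3 install -U academic
+
+# Generate one Markdown publication page per BibTeX entry
+# under content/publication/
+academic import --bibtex publications.bib
+```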
+
+And, perhaps most importantly, others can help me keep it up to date because I can share the [source](https://github.com/carlosmalt/academic-kickstart) of the new website via GitHub.
\ No newline at end of file
diff --git a/content/news/20200216/featured.png b/content/news/20200216/featured.png
new file mode 100644
index 00000000000..d6e6f745987
Binary files /dev/null and b/content/news/20200216/featured.png differ
diff --git a/content/news/20200216/index.md b/content/news/20200216/index.md
new file mode 100644
index 00000000000..540f0fe50ea
--- /dev/null
+++ b/content/news/20200216/index.md
@@ -0,0 +1,37 @@
+---
+title: "CROSS Workshop on Practical Reproducibility in Systems"
+summary: "Join Ivo Jimenez (UCSC), Marios Kogias (EPFL), Alexandru Uta
+(Leiden University) discussing aspects of practical reproducibility."
+authors:
+- carlosm
+tags:
+- cross
+- event
+categories:
+- News
+date: "2020-02-16"
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+---
+Please join us at the CROSS Workshop on Practical Reproducibility in Systems Research
+on Friday, February 28, 2020, with speakers including Ivo Jimenez (UCSC),
+Marios Kogias (EPFL), and Alexandru Uta (Leiden University) discussing reproducible workflows,
+microsecond-latency experiments, and public cloud network performance.
+[More ...](https://cross.ucsc.edu/news/events/2020228workshop.html)
diff --git a/content/news/20200222/featured.png b/content/news/20200222/featured.png
new file mode 100644
index 00000000000..766d818f951
Binary files /dev/null and b/content/news/20200222/featured.png differ
diff --git a/content/news/20200222/featured.xcf b/content/news/20200222/featured.xcf
new file mode 100644
index 00000000000..99e7750597e
Binary files /dev/null and b/content/news/20200222/featured.xcf differ
diff --git a/content/news/20200222/index.md b/content/news/20200222/index.md
new file mode 100644
index 00000000000..c7aa2a07c69
--- /dev/null
+++ b/content/news/20200222/index.md
@@ -0,0 +1,46 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "My schedule at Vault/FAST/NSDI 2020"
+subtitle: ""
+summary: "Looking forward to meeting friends and colleagues this week. Here is my schedule."
+authors:
+- carlosm
+tags:
+- event
+- programmable
+- storage
+- reproducibility
+categories:
+- News
+date: 2020-02-22T16:12:31-08:00
+lastmod: 2020-02-22T16:12:31-08:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- programmable-storage
+- declstore
+- practical-reproducibility
+---
+
+This is going to be a busy week with Vault '20, FAST '20, and NSDI '20 all happening at the same time. Jeff LeFevre is going to give [an update on SkyhookDM at Vault'20 on Monday, 1:30pm](https://www.usenix.org/conference/vault20/presentation/lefevre), and Alex Uta is going to present [our paper on reproducibility challenges in public cloud networks at NSDI'20 on Wednesday](https://www.usenix.org/conference/nsdi20/presentation/uta), in the 11am session on Measurement and Adaptation.
+
+Below is my schedule. Unfortunately, I won't be able to attend Thursday due to travel.
+
+Looking forward to seeing everyone!
+
+
\ No newline at end of file
diff --git a/content/news/20200419/featured.png b/content/news/20200419/featured.png
new file mode 100644
index 00000000000..90c34a6b298
Binary files /dev/null and b/content/news/20200419/featured.png differ
diff --git a/content/news/20200419/index.md b/content/news/20200419/index.md
new file mode 100644
index 00000000000..1f0e9a2322c
--- /dev/null
+++ b/content/news/20200419/index.md
@@ -0,0 +1,41 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "CROSS Funds Project on Open Source Autonomous Vehicles"
+subtitle: ""
+summary: "With selecting the Open Source Autonomous Vehicle Controller project for funding, CROSS now plays a part in the burgeoning open source hardware movement."
+authors: [slieggi]
+tags: [cross]
+categories: [News]
+date: 2020-04-19T11:06:11-07:00
+lastmod: 2020-04-19T11:06:11-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [cross]
+---
+
+[This is a repost of a [CROSS news release](https://cross.ucsc.edu/news/news/20200415newproject.html).]
+
+After the March 2020 Industrial Advisory Board (IAB) meeting, the Center for Research in Open Source Software (CROSS) approved funding for a new research project entitled [Open Source Autonomous Vehicle Controller (OSAVC)](https://cross.ucsc.edu/projects/osavcpage.html) proposed by ECE PhD student [Aaron Hunter](https://aaronhunterblog.wordpress.com/) and his advisor Professor [Gabriel Elkaim](https://www.soe.ucsc.edu/people/elkaim). The OSAVC project, which is scheduled to begin in Summer 2020, is an open source hardware and software project that provides the link between real-time control and intelligent decision making. Although a number of CROSS-funded projects have interacted with hardware, the OSAVC project is the first CROSS-funded effort that has the development of open source hardware as one of its primary goals.
+
+With the OSAVC project, CROSS now funds [eight projects](https://cross.ucsc.edu/projects/index.html): three incubator projects and five research projects. CROSS projects include researchers from five different research groups at the Baskin School of Engineering (BSOE), spanning three BSOE departments -- CSE, ECE, and Computational Media.
+
+CROSS teaches students how to productively interact with open source communities, funds high-impact research projects with plausible paths to successful open source projects (such as the one pursued by Hunter), and incubates developer communities for research prototypes. The OSAVC project explores designs of an open hardware/open firmware platform that will enable the next generation of autonomous vehicles by providing a seamless link between embedded control and intelligent systems. According to Hunter, CROSS’ support for this project “allows me to focus on the project exclusively, which will lead to a shorter time to completion and a higher probability of success.”
+
+By selecting the OSAVC project for funding, CROSS now plays a part in the burgeoning open source hardware movement. While open source hardware is not as widely accepted as open source software, the importance of open source hardware solutions and technologies is expected to increase significantly in the coming years: “The open source approach allows people from all over the world to collaboratively design, test, improve hardware at unprecedented speed, and then manufacture it right there where it is needed, as the world-wide response to the COVID-19 pandemic greatly illustrates,” said [Carlos Maltzahn](https://people.ucsc.edu/carlosm), founder and director of CROSS.
+
+CROSS looks forward to highlighting the work of OSAVC and other CROSS projects in the annual Research Symposium, scheduled for early October.
diff --git a/content/news/20200528/featured.jpg b/content/news/20200528/featured.jpg
new file mode 100644
index 00000000000..48bc9812e3a
Binary files /dev/null and b/content/news/20200528/featured.jpg differ
diff --git a/content/news/20200528/index.md b/content/news/20200528/index.md
new file mode 100644
index 00000000000..6d3dc12bd0e
--- /dev/null
+++ b/content/news/20200528/index.md
@@ -0,0 +1,40 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "Open Source Technology Essential in Facing COVID-19 Challenges"
+subtitle: ""
+summary: "Open source technologies, along with open science efforts, have been in the forefront of the global response, including CROSS fellows and faculty."
+authors: [slieggi]
+tags: [cross]
+categories: [News]
+date: 2020-05-28T11:16:45-07:00
+lastmod: 2020-05-28T11:16:45-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [cross]
+---
+
+[This is a repost of a [CROSS blog post](https://cross.ucsc.edu/news/blog/covid19response.html).]
+
+The COVID-19 pandemic has led to increased focus on the need for collaboration and openness to develop effective means to fight the spread of the virus and to minimize the societal damage of pandemic countermeasures. Open source technologies, along with open science efforts, have been at the forefront of the global response from the earliest stages of the COVID-19 outbreak, and crucial in tracing the coronavirus’s spread, mapping its genome, analyzing infection trends, manufacturing needed medical equipment, and serving communities most impacted by the outbreak.
+Open source contributors reacted quickly to the outbreak and, according to [one estimate](https://github.com/WeileiZeng/Open-Source-COVID-19), over 10,000 individuals have contributed to the multitude of open source projects aimed at mitigating the health and social impacts of the pandemic. A [search for COVID-19](https://github.com/topics/covid-19) on GitHub shows more than 3,700 repositories; a [curated list](https://github.com/martinwoodward/covid-19-projects) of these repositories highlights the work of open source communities on topics including global data, healthcare toolkits, machine-learning data sets, data visualization, stay-at-home help, hardware, and local responses. Volunteer opportunities are also highlighted on this list, including assisting with large-scale research through the crowdsourced computing project [Folding@Home](https://foldingathome.org/covid19/). This project is speeding up the rate at which researchers can obtain information about the coronavirus proteins and the forms they can take by tapping “[over four million CPUs and half a million GPUs](https://www.hpcwire.com/2020/05/20/lab-behind-the-record-setting-gpu-cloud-burst-joins-foldinghomes-covid-19-effort/)” from contributors donating their computer power since late March.
+
+Open source communities have also been effectively leveraging their human resources and reach through the creation of needed medical equipment and hardware, the establishment of disease tracking platforms, and facilitating the sharing and analysis of large data sets. Academic-based open source contributors have been particularly helpful in these areas. For example, numerous universities including the [Massachusetts Institute of Technology](https://www.cnbc.com/2020/04/22/mit-volunteers-created-open-source-low-cost-ventilator-for-covid-19.html) and [University of Minnesota](https://www.twincities.com/2020/04/15/coronavirus-meet-the-university-of-minnesotas-coventor/) have developed open source designs for inexpensive ventilators with automated resuscitation bags to push oxygen into a patient’s lungs. The [data repository](https://github.com/CSSEGISandData/COVID-19) and [COVID-19 dashboard](https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6) developed by Johns Hopkins University (JHU) provide researchers, public health officials and the general public with a hands-on, user-friendly tool to track the pandemic in real-time. The JHU project quickly became “[one of the canonical sources of data on the outbreak](https://github.blog/2020-03-23-open-collaboration-on-covid-19/)” used by scientists, journalists and statisticians from all over the world.
+UCSC’s Genomics Institute (GI) created the [Genome Browser for SARS-CoV-2](https://genome.ucsc.edu/covid19.html) in order to continuously update “[access to the latest molecular data in a format in which all data can be quickly cross-referenced and compared](https://www.biorxiv.org/content/10.1101/2020.05.04.075945v1).” This new browser platform was adapted from GI’s existing open source genome browser visualization tool and inputs molecular data from published studies and database submissions in order to map to the viral genome. The SARS-CoV-2 browser aims to support the development of therapeutics and vaccines against the virus. UCSC faculty researchers are also developing an open source 3D browser that will gamify crowdsourcing of coronavirus data. This project, [recently funded by CITRIS Seed Funding](https://citris-uc.org/citris-seed-funding-covid-19-response-awarded-projects/), will create a platform that will enable experts to post challenges, allow crowdsourced annotations and data manipulation, leverage 3D rendering and virtual reality, and utilize gamification to stimulate data analysis and collaboration.
+
+Researchers affiliated with the Center for Research in Open Source Software (CROSS) are likewise working on projects aimed at supporting pandemic mitigation efforts. CROSS incubator fellow Ivo Jimenez, who leads the [Black Swan project](https://falsifiable.us/), is collaborating with the University of Sonora (Mexico) on a [dashboard](https://covid19data.unison.mx/) that displays daily summaries of COVID-19 data for the state of Sonora. Dr. Jimenez and his collaborators will be building a prediction model using unsupervised learning in order to estimate the location of new cases, based on geographical information that is available in the data provided by the Mexican government. Professor Gabriel Elkaim, advisor to CROSS’s newest research project [Open Source Autonomous Vehicles](https://cross.ucsc.edu/news/news/20200415newproject.html), is working with a team to develop a small, soft robot that extends via eversion (a so-called vine robot) that will be used for automated naso-pharyngeal swabbing to test for presence of the virus. The robot will carry a swab through the nostril back to the pharynx and retract, depositing the swab in an appropriate collection container, allowing the test to be carried out without endangering healthcare workers.
+Open source communities, particularly contributors based in universities and research institutions, have been active in the global effort to fight the COVID-19 pandemic, and have increased the level of collaboration and speed at which data and resources are shared internationally. As the research, developer and medical communities continue to work collectively to develop effective solutions to end the pandemic and preserve public health, open source projects and communities will continue to be vital to these efforts.
diff --git a/content/news/20200605/featured.png b/content/news/20200605/featured.png
new file mode 100644
index 00000000000..78d336481ae
Binary files /dev/null and b/content/news/20200605/featured.png differ
diff --git a/content/news/20200605/index.md b/content/news/20200605/index.md
new file mode 100644
index 00000000000..2390b96fb8f
--- /dev/null
+++ b/content/news/20200605/index.md
@@ -0,0 +1,34 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "Guest at the Sustain Podcast to talk about CROSS"
+subtitle: ""
+summary: "Touching on the history of CROSS, how one can get involved, and some of the great work CROSS Fellows are doing."
+authors:
+- carlosm
+tags: [cross]
+categories: [News]
+date: 2020-06-05T11:16:45-07:00
+lastmod: 2020-06-05T11:16:45-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [cross]
+---
+
+I had the pleasure to be on the [Sustain Podcast](https://sustain.codefund.fm/), talking with [Richard Littauer](https://www.linkedin.com/in/richard-littauer-130026138/) and [Justin Dorfman](https://www.linkedin.com/in/justindorfman/) about "[How $2 Million Dollars Helped Build CROSS](https://fireside.fm/s/fxw-Bcan+AX3AU4Sm)".
+
+
diff --git a/content/news/20200623/featured.png b/content/news/20200623/featured.png
new file mode 100644
index 00000000000..f83f1d5717b
Binary files /dev/null and b/content/news/20200623/featured.png differ
diff --git a/content/news/20200623/index.md b/content/news/20200623/index.md
new file mode 100644
index 00000000000..2297b7ba432
--- /dev/null
+++ b/content/news/20200623/index.md
@@ -0,0 +1,32 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "Jianshen Liu advanced to Candidacy!"
+subtitle: ""
+summary: "Congratulations!"
+authors:
+- carlosm
+tags: [cross, advancement]
+categories: [News]
+date: 2020-06-23T11:16:45-07:00
+lastmod: 2020-06-23T11:16:45-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [eusocial-storage]
+---
+
+Jianshen Liu passed his Advancement to Candidacy. Congratulations! Thanks also go to advancement committee members Scott Brandt, Peter Alvaro, and Matthew L. Curry (Sandia National Labs).
diff --git a/content/news/20200728/featured.png b/content/news/20200728/featured.png
new file mode 100644
index 00000000000..d8e134c9f3d
Binary files /dev/null and b/content/news/20200728/featured.png differ
diff --git a/content/news/20200728/index.md b/content/news/20200728/index.md
new file mode 100644
index 00000000000..7dc309535bb
--- /dev/null
+++ b/content/news/20200728/index.md
@@ -0,0 +1,32 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "CROSS Featured in Linux Professional Institute Blog Post"
+subtitle: "The Value of Open Source to Universities: UC Santa Cruz Tests the Water"
+summary: "Andy Oram interviewed me the other day about CROSS and its successes and opportunities"
+authors:
+- carlosm
+tags: [cross, interview]
+categories: [News]
+date: 2020-07-28T11:16:45-07:00
+lastmod: 2020-07-28T11:16:45-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [cross]
+---
+
+[Andy Oram](http://praxagora.com/) interviewed me the other day about CROSS and its successes and opportunities. The result is now [published](https://www.lpi.org/blog/2020/07/28/value-open-source-universities-uc-santa-cruz-tests-water) on the blog of the [Linux Professional Institute](https://www.lpi.org/about-lpi/our-purpose), a non-profit, vendor-neutral global certification body for open source professionals.
diff --git a/content/news/20210227/featured.png b/content/news/20210227/featured.png
new file mode 100644
index 00000000000..b5ae837bb9c
Binary files /dev/null and b/content/news/20210227/featured.png differ
diff --git a/content/news/20210227/index.md b/content/news/20210227/index.md
new file mode 100644
index 00000000000..21817897991
--- /dev/null
+++ b/content/news/20210227/index.md
@@ -0,0 +1,36 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "Chairing the SC21 Reproducibility Initiative"
+subtitle: "Enhancing reproducibility of accepted papers and leading the student reproducibility challenge."
+summary: "Enhancing reproducibility of accepted papers and leading the student reproducibility challenge."
+authors:
+- carlosm
+tags: [organization, reproducibility]
+categories: [News]
+date: 2021-02-27T17:00:00-08:00
+lastmod: 2021-02-27T17:00:00-08:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [reproducibility]
+---
+
+I'm honored to have been selected to chair the [SC21 Reproducibility Initiative](https://sc21.supercomputing.org/submit/reproducibility-initiative/) and to work with Reproducibility Initiative Vice Chair _Ivo Jimenez_ (UC Santa Cruz), Reproducibility Challenge Chair _Le Mai Weakley_ (Indiana University), Artifact Description and Artifact Evaluation Appendices Co-Chairs _Tanu Malik_ (DePaul University) and _Anjo Vahldiek-Oberwagner_ (Intel), Journal Special Issue Chair _Stephen Harrell_ (Texas Advanced Computing Center), and Journal Special Issue Vice Chair _Scott A. Michael_ (Indiana University).
+
+A particular focus of this year's Initiative will be to advance Artifact Evaluation. Based on the experience of the top-tier systems conferences [SOSP'19](https://sosp19.rcs.uwaterloo.ca/) and [OSDI'20](https://www.usenix.org/conference/osdi20), authors of accepted papers will be able to apply for badges certifying public availability of artifacts, artifacts passing review, and artifacts reproducing results.
+
+See [here](https://sc21.supercomputing.org/submit/reproducibility-initiative/) for details.
diff --git a/content/news/20210901/featured.png b/content/news/20210901/featured.png
new file mode 100644
index 00000000000..3ce15bd4a50
Binary files /dev/null and b/content/news/20210901/featured.png differ
diff --git a/content/news/20210901/index.md b/content/news/20210901/index.md
new file mode 100644
index 00000000000..14adedb8bf4
--- /dev/null
+++ b/content/news/20210901/index.md
@@ -0,0 +1,36 @@
+---
+# Documentation: https://sourcethemes.com/academic/docs/managing-content/
+
+title: "New in SC21: SC Best Reproducibility Advancemeent Award"
+subtitle: "Recognizing outstanding efforts in improving transparency and reproducibility of methods for high performance computing, storage, networking and analysis."
+summary: "Recognizing outstanding efforts in improving transparency and reproducibility of methods for high performance computing, storage, networking and analysis."
+authors:
+- carlosm
+tags: [organization, reproducibility]
+categories: [News]
+date: 2021-09-01T17:00:00-08:00
+lastmod: 2021-09-01T17:00:00-08:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [reproducibility]
+---
+
+As part of the [SC21 Reproducibility Initiative](https://sc21.supercomputing.org/submit/reproducibility-initiative/) that _Ivo Jimenez_ (Vectorized.io) and I are chairing, we were able to establish the new [SC Best Reproducibility Advancement Award](https://sc21.supercomputing.org/program/awards/sc-best-reproducibility-advancement-award/) that recognizes outstanding efforts in improving transparency and reproducibility of methods for high performance computing, storage, networking and analysis. The new SC Award was approved by the SC Steering Committee on August 12, 2021 and will be awarded at future SC conferences beginning at this year's SC21. See [SC Best Reproducibility Advancement Award](https://sc21.supercomputing.org/program/awards/sc-best-reproducibility-advancement-award/) for details.
+
+Special thanks go to SC21 Artifact Description and Artifact Evaluation Appendices Co-Chairs _Tanu Malik_ (DePaul University) and _Anjo Vahldiek-Oberwagner_ (Intel) for authoring the Award abstract and for co-chairing the SC21 Best Reproducibility Advancement Award Co-Chairs selection committee. Special thanks go also to SC21 Technical Program Chair _Mary Hall_ (University of Utah), SC21 Conference Chair _Bronis de Supinski_ (Lawrence Livermore National Laboratory), SC21 Conference Vice Chair _Jeffrey Hollingsworth_ (University of Maryland), SC21 Conference Deputy Chair _Candace Culhane_ (Los Alamos National Laboratory), and SC Steering Committee Member _Michela Taufer_ (University of Tennessee, Knoxville) for championing this Award.
+
+One important role of this Award is to streamline the pipeline from SC accepted papers to selecting a paper for the SC Student Cluster Competition's Reproducibility Challenge in the following year. I want to thank Reproducibility Challenge Chair _Le Mai Weakley_ (Indiana University), Journal Special Issue Chair _Stephen Harrell_ (Texas Advanced Computing Center) and Journal Special Issue Vice Chair _Scott A. Michael_ (Indiana University) for their great work and for their input on awardee selection.
\ No newline at end of file
diff --git a/content/news/20220518/featured.png b/content/news/20220518/featured.png
new file mode 100644
index 00000000000..6335ea00cd6
Binary files /dev/null and b/content/news/20220518/featured.png differ
diff --git a/content/news/20220518/index.md b/content/news/20220518/index.md
new file mode 100644
index 00000000000..0ae5aeadeed
--- /dev/null
+++ b/content/news/20220518/index.md
@@ -0,0 +1,55 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: "OSPO UC Santa Cruz funded"
+subtitle: "Sloan Foundation funds two-year pilot open source program office at UC Santa Cruz"
+summary: "The Alfred P. Sloan Foundation awarded Professor Carlos Maltzahn a $695,000 two-year grant aimed at establishing an open source program office (OSPO) that will serve the entire campus community."
+authors: [slieggi]
+tags: [ospo]
+categories: [News]
+date: 2022-05-18T11:22:07-07:00
+lastmod: 2022-05-18T11:22:07-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo]
+---
+
+_(This is a slightly edited version of [this UCSC news release](https://news.ucsc.edu/2022/05/sloan-foundation-program.html))._
+
+The [Alfred P. Sloan Foundation](https://sloan.org/) awarded Professor Carlos Maltzahn a $695,000 two-year grant aimed at establishing an open source program office (OSPO) that will serve the entire campus community. UC Santa Cruz will be one of the first four universities in the United States to have such an office (Johns Hopkins, RIT, and Vermont are the others) and one of only two public universities among them.
+
+[Open Source Program Offices](https://todogroup.org/) ([OSPOs](https://todogroup.org/)) are an organizational innovation developed initially by companies in the tech sector as a way to manage their relationships with open source software ecosystems of strategic importance to their business. The UC Santa Cruz OSPO, together with the three other offices, is adapting this OSPO concept to build on research universities' missions of research, teaching, and public service.
+
+"We're extremely excited about the establishment of university open source program offices and their potential to legitimize and improve the development of open source software across the research enterprise," said Joshua Greenberg, director of the Sloan Foundation's technology program. "We're especially proud to support the establishment of an office at UC Santa Cruz. This work is being led by an accomplished team and represents an exciting development for the campus—as well as offering a blueprint for other UC schools."
+
+The UC Santa Cruz OSPO will transcend the scope of Baskin Engineering’s [Center for Research in Open Source Software](https://cross.ucsc.edu/) ([CROSS](https://cross.ucsc.edu/)) but not replace it. CROSS will continue its education, research, and incubator programs, with an increased focus on the topics of interest of its industry sponsors. CROSS was started by Maltzahn in 2015 to bridge the gap between student work and successful open source projects. Since its establishment, CROSS has raised $2.6 million in industry funding.
+
+“By generously sponsoring CROSS, major corporations including Toshiba, Kioxia, Micron, SK Hynix, Seagate, Western Digital, Samsung, and Fujitsu clearly signaled that industry-university research collaboration via open source strategies and techniques works,” observed Maltzahn. “Thanks to their engagement and Sloan Foundation’s funding, we now have an opportunity to significantly expand this collaboration in innovative ways.”
+
+As noted by Interim Vice Chancellor for Research John MacMillan, “The UCSC Office of Research will work closely with the OSPO team at cross-divisional and corporate engagement and help track funding of open source projects within the university. We are pleased that the Sloan Foundation chose to fund UC Santa Cruz’s efforts in open source – making it the first UC campus with an OSPO. We think this pilot project will allow UC Santa Cruz to be a model for UC campuses and other public universities.”
+
+As part of the Sloan Foundation award, the OSPO introduces a new incubator fellowship aimed at attracting postdoctoral scholars whose compelling and innovative open source research projects make them natural open source ambassadors to UC Santa Cruz. The OSPO will establish a teaching fellowship to encourage innovation in teaching students how to productively engage with open source projects. The OSPO will also promote and track funding of open source research efforts to help estimate the value of open source to UC Santa Cruz and the UC system as a whole.
+
+The OSPO team, led by Maltzahn and CROSS Assistant Director Stephanie Lieggi, runs the [Open Source Research Experience](https://cross.ucsc.edu/2022-osre/index.html) ([OSRE](https://cross.ucsc.edu/2022-osre/index.html)) program, an exchange of summer project ideas that matches student contributors and sponsors with mentors working on open source research efforts. As part of the OSPO, the OSRE will be able to expand and engage mentors across UC campuses and associated national labs.
+
+Maltzahn summarized: “The Open Source Research Experience is the kind of OSPO program we want: inclusive and easy to engage with, highly rewarding, scalable across multiple UC campuses, well-aligned with the goals and desires of stakeholders, and increasing in value with the number of contributors.”
+
+For more on the OSPO and how it works, check out [the recent feature](https://research.redhat.com/blog/article/building-a-university-ospo-bolstering-academic-research-through-open-source/) published by the Red Hat Research Quarterly and the OSPO UC Santa Cruz website, [ospo.ucsc.edu](https://ospo.ucsc.edu).
\ No newline at end of file
diff --git a/content/news/20220823/featured.png b/content/news/20220823/featured.png
new file mode 100644
index 00000000000..32e98b92b87
Binary files /dev/null and b/content/news/20220823/featured.png differ
diff --git a/content/news/20220823/index.md b/content/news/20220823/index.md
new file mode 100644
index 00000000000..9c650f8145d
--- /dev/null
+++ b/content/news/20220823/index.md
@@ -0,0 +1,115 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: "Register: 2022 UC Santa Cruz Open Source Symposium"
+subtitle: "Registration is now open!"
+summary: "Registration is now open for the 2022 UC Santa Cruz Open Source Symposium: the hybrid event will take place at UC Santa Cruz on September 27-29, 2022 -- with the not-to-be-missed Systems Oktoberfest returning on the first day's evening at the lovely UC Santa Cruz Hay Barn!"
+authors:
+ - slieggi
+ - carlosm
+tags: [ospo]
+categories: [News]
+date: 2022-08-23T11:22:07-07:00
+lastmod: 2022-08-23T11:22:07-07:00
+featured: true
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: "Smart"
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo]
+---
+
+{{% cta cta_link="https://ucsc.irisregistration.com/Register/Form/Form?formId=10271" cta_text="Register →" %}}
+
+The UC Open Source Research Symposium (successor to the CROSS Research Symposia series) mixes in-person, hybrid, and fully remote activities to give a larger audience the opportunity to learn about the cutting-edge open source research being done throughout the University of California. Participants can interact with UC faculty, graduate students, and affiliated researchers from campuses throughout the UC system, discuss future directions, and discover areas of collaboration.
+
+> Returning after a two-year hiatus, the Symposium will once again include the UC Santa Cruz Systems Research Lab’s annual Oktoberfest barbeque on the evening of the first day of the symposium.
+
+This year’s event is organized by the [UC Santa Cruz Open Source Program Office](https://ospo.ucsc.edu/) (OSPO), and we thank our [sponsors](#sponsors) for helping make this event possible. It will include keynote speakers, expert panels, and technical workshops, as well as poster presentations highlighting the work of our Open Source Research Experience students. This year we are excited to host keynotes by Demetris Cheatham (GitHub), Nithya Ruff (Amazon), Karsten Wade (Red Hat), and Stephen Walli (Microsoft).
+
+> Free for UC-affiliated and remote participants
+> Cost for non-UC-affiliated participants: $299
+
+Interested in sponsoring this event? Sponsors get complimentary registration, recognition, and other benefits (see [below](#sponsor-benefits)).
+
+{{% cta cta_link="https://ucsc.irisregistration.com/Register/Form/Form?formId=10271" cta_text="Register →" %}}
+
+# Agenda Overview
+
+- **Tue 9/27** @ UC Santa Cruz Hay Barn. Plenary (single-track) sessions. In-person participation limited to 50 people; remote participation also available.
+
+- **Wed 9/28** @ UC Santa Cruz Baskin Engineering 2, 5th Floor. Technical Workshops (two-track). In-person participation limited to 60 people; remote participation also available.
+
+- **Thu 9/29**, Fully Remote. Technical Workshops (two-track).
+
+# Detailed Agenda (scroll)
+
+{{< gdocs src="https://docs.google.com/document/d/e/2PACX-1vTrtltPPiCV6ML5Y6Z7FmmKsAQiVpNpSMtuG4CAinroC_9dTN9543r9qHArcJMjRhWj0a-Hk-E-bJsJ/pub?embedded=true" >}}
+
+{{% cta cta_link="https://ucsc.irisregistration.com/Register/Form/Form?formId=10271" cta_text="Register →" %}}
+
+# Sponsors
+
+## Platinum
+
+[![Alfred P. Sloan Foundation](Logo-2B-SMALL-Gold-Blue.png)](https://sloan.org)
+
+[![Center for Research in Open Source Software - cross.ucsc.edu](SwagLogo.stickerCropped.png)](https://cross.ucsc.edu)
+
+## Gold
+
+## Silver
+
+# Sponsor Benefits
+
+Does your organization want to support and promote open source innovation in academia? Do you want to foster collaboration between industry, academia and open source communities? Why not be a sponsor for the 2022 Open Source Research Symposium?
+
+Organizations that sponsor this event will be given public acknowledgement on the Symposium website and agenda, and get free passes for in-person participation.
+
+| Sponsorship Level | Amount | Benefits |
+| --- | --- | --- |
+| Silver | $500 | Acknowledgements + 1 free registration |
+| Gold | $1,000 | Acknowledgements + 2 free registrations |
+| Platinum | $2,000 | Acknowledgements + 4 free registrations |
+
+If your organization might be interested in sponsoring, please contact Stephanie Lieggi ([slieggi@ucsc.edu](mailto:slieggi@ucsc.edu)).
\ No newline at end of file
diff --git a/content/news/20220827/featured.jpg b/content/news/20220827/featured.jpg
new file mode 100644
index 00000000000..96cea50661e
Binary files /dev/null and b/content/news/20220827/featured.jpg differ
diff --git a/content/news/20220827/index.md b/content/news/20220827/index.md
new file mode 100644
index 00000000000..aa2639233cb
--- /dev/null
+++ b/content/news/20220827/index.md
@@ -0,0 +1,56 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: "NSF award will support project to promote reproducibility in computer science"
+subtitle: ""
+summary: "With the support of a three-year, $900,000 grant from the NSF, Carlos Maltzahn will participate in collaborative research to increase the reproducibility of computer science research."
+authors: ["Emily Cerf"]
+tags: []
+categories: [News]
+date: 2022-08-27T18:18:18-07:00
+lastmod: 2022-08-27T18:18:18-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo, reproducibility]
+---
+_(This is a slightly edited version of [this UCSC news release](https://news.ucsc.edu/2022/08/maltzahn-fairos-award.html))._
+
+With the support of a three-year, $900,000 grant from the National Science Foundation (NSF), Adjunct Professor of Computer Science and Engineering Carlos Maltzahn and the UC Santa Cruz Center for Research in Open Source Software (CROSS) will participate in collaborative research to increase the reproducibility of computer science research.
+
+This grant comes from the inaugural year of an NSF initiative, called Findable Accessible Interoperable Reusable (FAIR) Open Science Research Coordination Networks ([FAIROS RCN](https://www.nsf.gov/pubs/2022/nsf22553/nsf22553.htm)), that creates groups of researchers who lead by example to promote open science results and artifacts. Overall, [10 new project groups were funded](https://www.arl.org/news/arl-applauds-nsf-open-science-investment/) with a combined $12.5 million to create open source communities that foster a vibrant exchange of artifacts within common infrastructure.
+
+“There's a huge shift going on,” said Maltzahn, director of UCSC CROSS. “I think it has to do with the realization of how much value the industry places on open source, and that open science and networks of expertise have to be more inclusive and involve stakeholders across academia, industry, government and open source communities. That becomes especially important when you talk about revitalizing U.S. high tech manufacturing.”
+
+Maltzahn will work with the Repeto project, a group focused on practical reproducibility in computer science research. Reproducibility allows researchers to verify findings, accelerate the research process to more quickly gain insights, and have their products more widely used in research labs, classrooms, and industry. It also helps students gain a deeper understanding of the original researcher’s thought process.
+
+Involving researchers from the University of Chicago and New York University, the [Repeto project](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2226407) strives to make the reproducibility of computer science research practical, so that many experiments can be repeated cost-effectively. Overall, they will create infrastructure, teach and mentor students, lead workshops, and create community best practices related to this goal.
+
+Through these efforts, Maltzahn and his collaborators hope to better understand and foster the “market of reproducibility” to ensure that artifacts, such as pieces of software, are available for replication, but that those artifacts are both useful and used.
+
+“The aim of Repeto is that both creating reproducible artifacts is really easy, and consuming those artifacts is really easy,” Maltzahn said. “The overall thought is that convenient reproducibility artifacts will accelerate the cycle of research, so you will get a much faster succession of insights and a powerful toolkit to improve student training.”
+
+UCSC’s role in this project will be to convene a world-wide program in 2023 called the “Summer of Reproducibility,” following the model of CROSS’s [Open Source Research Experience](https://ospo.ucsc.edu/post/osre/) program, which matches students with mentors working on open source projects. Similarly, for the Repeto project, undergraduate students participating in the Summer of Reproducibility will work to replicate a published piece of research.
+
+This will allow students to gain a deeper understanding of the experiments they have repeated as compared to just reading about them, and allow the mentors to better understand what is needed in order for their work to be truly reproducible.
+
+Maltzahn will collaborate with Assistant Director of UCSC CROSS Stephanie Lieggi to put on the Summer of Reproducibility. He will also collaborate with lead principal investigator Kate Keahey, senior computer scientist at Argonne National Lab and CASE Senior Scientist affiliated with the Department of Computer Science at the University of Chicago; Haryadi Gunawi, associate professor of computer science at the University of Chicago; and Fraida Fund, research assistant professor in electrical and computer engineering at NYU.
+
+The University of Chicago researchers will focus on building and maintaining the infrastructure to make practical reproducibility possible. They will also convene workshops on topics around reproducibility. NYU will focus on best practices for teaching and applying reproducibility in the classroom.
+
+With connections made through researchers interested in reproducibility as part of the [Association for Computing Machinery Emerging Interest Group for Reproducibility](https://reproducibility.acm.org/) effort, Maltzahn and team have created an international steering committee for the project, involving people across disciplines including computer science, library science, and social science.
+
+All of the [10 FAIROS RCN awardee groups](https://www.arl.org/news/arl-applauds-nsf-open-science-investment/) are expected to work together in sharing artifacts and will have monthly meetings led by the program director. UC San Diego’s Supercomputer Center (SDSC) is also part of the cohort, making the UC system a major participant in this initiative. Members of the cohort such as North Carolina Central University provide exciting outreach opportunities to students and faculty at Historically Black Colleges and Universities (HBCUs) and Minority-Serving Institutions (MSIs).
diff --git a/content/news/20220909/featured.png b/content/news/20220909/featured.png
new file mode 100644
index 00000000000..f02e34da1eb
Binary files /dev/null and b/content/news/20220909/featured.png differ
diff --git a/content/news/20220909/index.md b/content/news/20220909/index.md
new file mode 100644
index 00000000000..5f7112562ee
--- /dev/null
+++ b/content/news/20220909/index.md
@@ -0,0 +1,37 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: "NSF will fund design of a support infrastructure for CROSS incubator project"
+subtitle: ""
+summary: "With the support of a one-year, $300,000 grant from the NSF, PI Carlos Maltzahn and co-PI Stephanie Lieggi will explore sustainable support infrastructures for the Skyhook Data Management project."
+authors: [carlosm]
+tags: []
+categories: [News]
+date: 2022-09-09T19:15:56-07:00
+lastmod: 2022-09-09T19:15:56-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo, skyhook]
+---
+
+I am delighted to be one of the inaugural Phase 1 awardees of the [NSF/TIP](https://beta.nsf.gov/tip/latest) [Pathways to Enable Open Source Ecosystems (POSE) program](https://beta.nsf.gov/funding/opportunities/pathways-enable-open-source-ecosystems-pose), with [Stephanie Lieggi]({{< relref "/authors/slieggi" >}}) as co-PI. The goal of this 1-year, $300,000 project is to explore support infrastructures for [Skyhook Data Management]({{< relref "/project/skyhook" >}}), a graduated CROSS incubator project.
+
+**Title:** POSE: Phase I: Scoping the Ecosystem of Skyhook Data Management
+
+**Abstract:** New, well-funded, and fast-moving open source ecosystems around big data and data science have emerged due to the successful business models in hyperscale computing industries. These include the Apache Arrow ecosystem for processing structured data and the Ceph distributed storage ecosystem. Skyhook Data Management embeds Apache Arrow in Ceph and is a result of years of storage systems research at UC Santa Cruz where Ceph originated. Embedding processing of data into storage can dramatically reduce data movement, a major cost center in datacenters. This Phase 1 project explores sustainable and effective pathways for establishing open source as an alternative translation path for technologies, using Skyhook as a pilot project. The project’s novelties are a series of workshops that convene open source experts and community leaders with diverse backgrounds to work out governance, staffing, and staff retention strategies for Skyhook while also building out expertise for open tech transfer within the university. As co-founder of the Ceph project and as founder and leader of the UC Santa Cruz Center for Research in Open Source Software and the Open Source Program Office UC Santa Cruz, the investigators are well-positioned to convene these workshops due to their professional network of open source experts in industry and foundations. An important focus in these workshops is inclusiveness to foster a diverse community and encourage participation from historically excluded communities. The project’s impacts are the adoption of Skyhook technology for production, for reproducible research prototyping, and as a teaching tool in classrooms, and the establishment of open source as a viable translation path of technologies for research universities.
+
+Apache Arrow is a representation of columnar data in memory which has created a wide-ranging and rapidly growing open source ecosystem of efficient data processing with many different programming language bindings (so far C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, and Rust). Due to the common representation, data can move efficiently without conversion between the ecosystem's processing engines running on different systems. Data processing and exchange can be implemented with a number of building blocks that includes the Parquet file format, the Flight framework for efficient data interchange between processes, the Gandiva LLVM-based JIT computation for executing analytical expressions by leveraging modern CPU SIMD instructions to process Arrow data, the Acero streaming execution engine for query plans, and Awkward Array for restructuring computation on columnar and nested data. On top of these building blocks exist a number of Arrow integration frameworks, including the Fletcher framework that integrates FPGAs with Apache Arrow, Nvidia’s RAPIDS cuDF framework that integrates GPUs with Apache Arrow, the Plasma high-performance shared-memory object store, and the Substrait open format for query plans between query optimizers and processing engines. Skyhook integrates Ceph with Arrow by embedding Arrow processing engines within Ceph storage objects such that objects can be accessed via Apache Arrow API calls that are executed on a storage server. API calls are atomic and, in case of failures, Ceph automatically remaps the call to another server where the object is available due to storage redundancy. Skyhook aims to become a research prototyping ecosystem and a blueprint for efficiently embedding data processing libraries in storage systems and computational storage devices while enabling processing and storage ecosystems to evolve independently.
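+
+To make the pushdown idea concrete, here is a minimal sketch using plain `pyarrow` against a local Parquet dataset (no Ceph or Skyhook involved); the table contents and column names are invented for illustration:
+
+```python
+import pyarrow as pa
+import pyarrow.dataset as ds
+
+# Write a small columnar table as a Parquet dataset (a stand-in for
+# tables kept in a storage system).
+table = pa.table({
+    "event_id": pa.array(range(1000), type=pa.int64()),
+    "energy": pa.array([float(i % 97) for i in range(1000)]),
+})
+ds.write_dataset(table, "events", format="parquet")
+
+# Declare projection and selection; a capable backend (such as Skyhook's
+# Ceph plugin) can execute these on the storage servers instead of
+# shipping whole objects to the client.
+dataset = ds.dataset("events", format="parquet")
+result = dataset.to_table(
+    columns=["event_id"],            # projection
+    filter=ds.field("energy") > 90,  # selection
+)
+print(result.num_rows)
+```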
\ No newline at end of file
diff --git a/content/news/20221026/featured.jpg b/content/news/20221026/featured.jpg
new file mode 100644
index 00000000000..7350cdfff91
Binary files /dev/null and b/content/news/20221026/featured.jpg differ
diff --git a/content/news/20221026/index.md b/content/news/20221026/index.md
new file mode 100644
index 00000000000..28b6a7b4267
--- /dev/null
+++ b/content/news/20221026/index.md
@@ -0,0 +1,35 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: "Mentoring FARR Fellow Milad Hakimshafaei"
+subtitle: "Investigating bias in technology through Baskin Engineering's anti-racism research fellowship"
+summary: "Incubator Fellow and PolyPhy Project Lead Oskar Elek and Stephanie Lieggi, executive director for OSPO and the Center for Research in Open Source Software, Hakimshafaei will develop an interface built off of an existing web-based AI generator that will allow the end user to control the system’s behavior to generate interesting and unique patterns."
+authors: [carlosm]
+tags: []
+categories: [News]
+date: 2022-10-26T11:15:56-07:00
+lastmod: 2022-10-26T11:15:56-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ""
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo, polyphy]
+---
+
+I'm very proud to have the OSPO UC Santa Cruz play a part in [Baskin Engineering Inclusive Excellence Hub](https://ieh.soe.ucsc.edu/)'s Fellowship for Anti-Racism Research (FARR) by providing mentorship to computational media Ph.D. student Milad Hakimshafaei in his work to expand diverse perceptions in visual design.
+
+>Incubator Fellow and PolyPhy Project Lead {{% mention oelek %}} and {{% mention slieggi %}}, executive director for OSPO and the Center for Research in Open Source Software, Hakimshafaei will develop an interface built off of an existing web-based AI generator that will allow the end user to control the system’s behavior to generate interesting and unique patterns. By studying a broad array of users’ visual design aesthetic preferences, he plans to develop algorithms that better represent the experience of beauty for different people.
+
+I encourage you to read the [full news release](https://news.ucsc.edu/2022/10/fellowship-for-anti-racism-research-2022.html).
\ No newline at end of file
diff --git a/content/news/20230222/featured.jpg b/content/news/20230222/featured.jpg
new file mode 100644
index 00000000000..aa1e9e4fe2e
Binary files /dev/null and b/content/news/20230222/featured.jpg differ
diff --git a/content/news/20230222/index.md b/content/news/20230222/index.md
new file mode 100644
index 00000000000..654fb016d41
--- /dev/null
+++ b/content/news/20230222/index.md
@@ -0,0 +1,22 @@
+---
+title: "Accepted as 2023 GSoC Mentor Organiztion"
+subtitle: "OSRE 2023 and SoR 2023 will benefit from GSOC outreach and funding."
+#summary: "Drop in and ask us questions about this summer program starting at 10:30am Pacific Time on January 26, 2023. Take a look at info about these programs on this website or watch one of the earlier [videos](https://youtube.com/playlist?list=PLgEgostMUSe0uH-iqE3kUbsb-W_LRZaLv). We will give a brief overview of the program and discuss the benefits of being a Summer of Reproducibility mentor, a joint program with the NSF-funded Repeto Project."
+authors: [slieggi, carlosm]
+tags: [sor,osre]
+categories: [News]
+date: 2023-02-22
+lastmod: 2023-02-22
+featured: false
+draft: false
+active: true
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [ospo]
+---
+
+We are very excited to have been selected as a [2023 Google Summer of Code](https://summerofcode.withgoogle.com) Mentor Organization. The [2023 Open Source Research Experience](https://ospo.ucsc.edu/osre), and with it the [2023 Summer of Reproducibility](https://ospo.ucsc.edu/sor), will therefore benefit from GSoC's worldwide outreach and generous funding of contributors. We are now listed among the [172 selected mentor organizations](https://summerofcode.withgoogle.com/programs/2023/organizations), under "Data" and "Science and medicine".
\ No newline at end of file
diff --git a/content/news/20230607/featured.png b/content/news/20230607/featured.png
new file mode 100644
index 00000000000..de1eeef454b
Binary files /dev/null and b/content/news/20230607/featured.png differ
diff --git a/content/news/20230607/featured.pptx b/content/news/20230607/featured.pptx
new file mode 100644
index 00000000000..68e62c42224
Binary files /dev/null and b/content/news/20230607/featured.pptx differ
diff --git a/content/news/20230607/index.md b/content/news/20230607/index.md
new file mode 100644
index 00000000000..7e78155a962
--- /dev/null
+++ b/content/news/20230607/index.md
@@ -0,0 +1,37 @@
+---
+title: "Presidential Chair Appointment"
+subtitle: "Appointed as the Sage Weil Presidential Chair for Open Source Software."
+#summary: "Drop in and ask us questions about this summer program starting at 10:30am Pacific Time on January 26, 2023. Take a look at info about these programs on this website or watch one of the earlier [videos](https://youtube.com/playlist?list=PLgEgostMUSe0uH-iqE3kUbsb-W_LRZaLv). We will give a brief overview of the program and discuss the benefits of being a Summer of Reproducibility mentor, a joint program with the NSF-funded Repeto Project."
+authors: [carlosm]
+tags: [cross]
+categories: [News]
+date: 2023-06-07
+lastmod: 2023-06-07
+featured: false
+draft: false
+active: true
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: [cross]
+---
+
+Today I was notified by the Chancellor that I will be appointed as the Sage Weil Presidential Chair for Open Source Software. This is an incredible honor and I thank those who nominated and selected me. And I thank Sage Weil.
+
+The Sage Weil Presidential Chair for Open Source Software was endowed with a $500,000 donation by [Sage Weil](https://www.linkedin.com/in/sageweil/) who founded the [Ceph project](https://ceph.io) as part of his Ph.D. project at UC Santa Cruz, plus a $500,000 match by the President of the University of California. The Chair was first awarded to Sage's advisor [Scott Brandt](https://www.linkedin.com/in/scott-brandt-074177/) who relinquished it last year.
+
+Here is what I wrote back in January in my statement in response to my nomination:
+
+> Sage Weil expressed his wish on June 2, 2014 at a dinner meeting with Scott Brandt and me to help with establishing a structure at the university that would give other students a similar PhD career as he enjoyed. Sage had just sold his startup Inktank to Red Hat for $175,000,000 and the meeting was about ways he could give back to UC Santa Cruz where he had created the Ceph open source distributed storage system, the technology that Inktank was based on. When I mentioned the Open Source Programming course that then PhD student Andrew Shewmaker and I were developing for teaching undergraduate students how to be productive in open source communities, Sage immediately recognized it as a great starting point for helping students to increase their research and education impact with open source techniques and strategies. The result of the conversation was Sage giving $500,000 to fund the Sage Weil Presidential Chair for Open Source Software (matched with $500,000 by the UC Office of the President) and $2,000,000 to me for research in open source software.
+
+> In 2015, as a result of Sage’s gift, I founded the [Center for Research in Open Source Software](cross.ucsc.edu) and since then raised well over $2,000,000 in membership fees, primarily from electronic component makers who were greatly benefiting from Ceph as a way to sell components directly to customers, a much larger market than the traditional market controlled by a few system vendors, and who were excited about the development of new open source technologies as a catalyst for new markets for innovative components. In 2016 I recruited [Stephanie Lieggi](https://www.linkedin.com/in/stephanie-lieggi-8542624/) as the executive director of CROSS.
+
+> In 2022, as a result of the outstanding success of CROSS, Stephanie and I received a grant from Alfred P. Sloan Foundation to establish an [Open Source Program Office](ospo.ucsc.edu) at UC Santa Cruz that establishes and runs new programs to amplify impact of university research and to make the value of open source to the university more visible. Programs established so far are the [Open Source Incubator Fellowship](https://ospo.ucsc.edu/post/incubator/) which supports postdoctoral scholars with research agendas that serve as outstanding examples of how to leverage open source communities in research, the [Open Source Research Experience](https://ospo.ucsc.edu/osre/) (OSRE) which pairs UC mentors (and mentors from associated national labs) with summer students found via world-wide outreach programs such as the [Google Summer of Code](https://summerofcode.withgoogle.com/), and the [Summer of Reproducibility](https://ospo.ucsc.edu/sor/) (SoR) which extends OSRE by pairing summer students with researchers and lecturers who are interested in advancing reproducibility in computing. SoR is a joint effort with the NSF-funded [REPETO project](https://voices.uchicago.edu/repeto/).
+
+> Alfred P. Sloan Foundation now issued a general call for proposals of university OSPOs and we are working with other UC campuses on responses with the intent to build a network of OSPOs across the UC system. We are also working with UC Santa Cruz Baskin Engineering faculty to establish a research center similar to CROSS but with a focus on open source hardware. We are hiring to expand the OSPO team, and we have a number of new programs addressing open source sponsorships of UC research, staffing software engineering support for open source ecosystems, and mapping research efforts to relevant open source communities and vice versa.
+
+> If I were to be selected for the Sage Weil Presidential Chair for Open Source Software, I would use the scholarly allowance to help establish the OSPO UC Santa Cruz and its programs and to help with the development of research centers like CROSS.
+
+The scholarly allowance will be put to good use!
\ No newline at end of file
diff --git a/content/news/20230621/featured.png b/content/news/20230621/featured.png
new file mode 100644
index 00000000000..998aba7adac
Binary files /dev/null and b/content/news/20230621/featured.png differ
diff --git a/content/news/20230621/index.md b/content/news/20230621/index.md
new file mode 100644
index 00000000000..6e21585a2a5
--- /dev/null
+++ b/content/news/20230621/index.md
@@ -0,0 +1,27 @@
+---
+title: "Congratulations, Dr. Liu!"
+summary: 'Please join Peter Alvaro, Scott Brandt, Craig Ulmer, and me in congratulating Dr. Jianshen Liu on his successful Ph.D. defense today on "Extending Composable Data Services to Embedded Systems".'
+authors: [admin]
+tags: [Presentation]
+categories: [News]
+date: 2023-06-21
+lastmod: 2023-06-21
+featured: false
+draft: false
+active: true
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- smartnic
+- eusocial-storage
+- skyhook
+- cross
+---
+
+Please join Peter Alvaro, Scott Brandt, Craig Ulmer (Sandia National Labs), and me in congratulating Dr. Jianshen Liu on his successful Ph.D. defense today on "Extending Composable Data Services to Embedded Systems".
+
+Well done, Jianshen!
diff --git a/content/news/20231201/featured.png b/content/news/20231201/featured.png
new file mode 100644
index 00000000000..70d5f334bc8
Binary files /dev/null and b/content/news/20231201/featured.png differ
diff --git a/content/news/20231201/index.md b/content/news/20231201/index.md
new file mode 100644
index 00000000000..a6cb2933795
--- /dev/null
+++ b/content/news/20231201/index.md
@@ -0,0 +1,24 @@
+---
+title: "Farid Zakaria advanced to Candidacy!"
+summary: 'Farid Zakaria passed his Advancement to Candidacy. Congratulations! Thanks also go to their committee members Jim Whitehead (chair), Andrew Quinn (co-advisor), Todd Gamblin (LLNL), and Tom Scogland (LLNL).'
+authors: [admin]
+tags: [reproducibility]
+categories: [News]
+date: 2023-12-01
+lastmod: 2023-12-01
+featured: false
+draft: false
+active: true
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- packaging
+---
+
+Farid Zakaria passed his Advancement to Candidacy. Congratulations! Thanks also go to their committee members Jim Whitehead (chair), Andrew Quinn (co-advisor), Todd Gamblin (LLNL), and Tom Scogland (LLNL).
+
+Well done, Farid!
diff --git a/content/news/20231215/featured.jpg b/content/news/20231215/featured.jpg
new file mode 100644
index 00000000000..522563113fc
Binary files /dev/null and b/content/news/20231215/featured.jpg differ
diff --git a/content/news/20231215/index.md b/content/news/20231215/index.md
new file mode 100644
index 00000000000..fd891bab4e4
--- /dev/null
+++ b/content/news/20231215/index.md
@@ -0,0 +1,43 @@
+---
+title: "I'm retiring!"
+summary: 'I retired from UC Santa Cruz on December 15, 2023. Both CROSS and the OSPO UC Santa Cruz continue under the leadership of James Davis, Stephanie Lieggi, and Emily Lovell.'
+authors: [admin]
+tags: [Website]
+categories: [News]
+date: 2023-12-15
+lastmod: 2023-12-15
+featured: false
+draft: false
+active: true
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- cross
+---
+
+After 19 years at UC Santa Cruz I retired on December 15, 2023. The reason is entirely personal. I found my time at UC Santa Cruz extremely fulfilling, and I really enjoyed working with all of you!
+
+I’m aware that this announcement comes rather suddenly. In fact, I have been planning this step for a while. However, as a soft-money guy, I wouldn’t have gotten funding for my group if I had announced my retirement too early. So it’s a caught-a-tiger-by-the-tail kind of situation. I hope for your understanding.
+
+I'm really happy about CROSS and the OSPO being able to continue in very capable hands: [Stephanie Lieggi](https://ucsc-ospo.github.io/author/stephanie-lieggi/) will continue as executive director and [James Davis is succeeding me as faculty director](https://news.ucsc.edu/2024/02/davis-cross-director.html) — both are very familiar with the structure, vision, and finances of CROSS and the OSPO (James Davis has been serving on the CROSS Advisory Committee for a number of years). Also part of the leadership team will be [Emily Lovell](https://ucsc-ospo.github.io/author/emily-lovell/).
+
+James fully supports the vision of the OSPO UC Santa Cruz and our effort to expand across multiple UC campuses with the “UC Network of OSPOs”. Stephanie is leading that amazing six-campus-spanning effort, including UC Berkeley, UC Davis, UCLA, UCSB, UCSC, and UCSD. We are also seeing progress towards getting the OSPO UC Santa Cruz established as part of the Office of Research, and we are working on having Stephanie lead it.
+
+The new PIs and co-PIs for my ongoing grants are: [Cormac Flanagan](https://users.soe.ucsc.edu/~cormac/) and [Stephanie Lieggi](https://ucsc-ospo.github.io/author/stephanie-lieggi/) for [REPETO](https://repeto.cs.uchicago.edu/), [Heiner Litz](https://www.linkedin.com/in/heiner-litz-3a332713/) for [IRIS-HEP](https://iris-hep.org/), and [Andrew Quinn](https://arquinn.github.io/) for the LLNL subcontract with [Todd Gamblin](https://www.linkedin.com/in/tgamblin/).
+
+Foremost I thank my graduated Ph.D. students [Sasha Ames](http://www.linkedin.com/in/sashaames) (2011), [Joe Buck](http://www.linkedin.com/pub/joe-buck/3/70a/97a) (2014), [Dimitris Skourtis](http://www.linkedin.com/in/skourtis) (2014), [Adam Crume](http://www.linkedin.com/pub/adam-crume/48/7b3/330) (2015), [Andrew Shewmaker](http://www.linkedin.com/in/ashewmaker) (2016), [Lucho Ionkov](http://www.linkedin.com/pub/latchesar-ionkov/2/b9b/768) (2018), [Michael Sevilla](http://www.linkedin.com/in/michaelandrewsevilla) (2018), [Noah Watkins](http://www.linkedin.com/in/noahwatkins) (2018), [Ivo Jimenez](http://www.linkedin.com/in/ivotron) (2019), and [Jianshen Liu](https://www.linkedin.com/in/jianshenliu/) (2023). Their passionate curiosity, never-ending sense of wonder, and hard work were the real reason why anything got accomplished.
+
+I want to apologize to my current Ph.D. students for forcing them to pick a new advisor in the middle of their UC Santa Cruz careers. But I'm very happy that they all found great new advisors: [Esmaeil Mirvakili](https://www.linkedin.com/in/esmaeil-m-12a71879/) is now working with [Chen Qian](https://www.linkedin.com/in/chen-qian-7b59b521/), [Farid Zakaria](https://www.linkedin.com/in/fmzakari/) with [Andrew Quinn](https://arquinn.github.io/), and [Jayjeet Chakraborty](https://www.linkedin.com/in/jayjeet-chakraborty-077579162/) with [Heiner Litz](https://www.linkedin.com/in/heiner-litz-3a332713/).
+
+I thank [Sage Weil](https://www.linkedin.com/in/sageweil/) for creating Ceph, for the initial spark of an idea and incredible generosity without which CROSS would not exist, for lending his amazing track record and credibility to CROSS that was essential for recruiting our first industry members, and for sharing his technical expertise in many CROSS research and incubator project reviews. I thank [Karen Sandler](https://www.linkedin.com/in/karensandler/) for writing part of the CROSS membership agreement and, together with [Nissa Strottman](https://www.linkedin.com/in/nissastrottman/), sharing their invaluable legal expertise and experience. I thank [Doug Cutting](https://www.linkedin.com/in/cutting/) for his great guest lectures, sharing his technical expertise, and explaining the Apache Software Foundation to us. I thank [Nithya Ruff](https://www.linkedin.com/in/nithyaruff/) for championing CROSS at all the important places, and for connecting us to OSPO++, a fantastic community that is helping us establish an OSPO at UC Santa Cruz and other UC campuses.
+
+I thank Lavinia Preston, Cynthia McCarley, and Genine Scelfo as well as so many more fantastic individuals in the leadership and staff on all levels of the UC Santa Cruz campus for making everything work and for their skillful navigation of the UC Santa Cruz bureaucracy.
+
+Last but not least, I thank [Scott Brandt](https://www.linkedin.com/in/scott-brandt-074177/) for his invaluable support and mentorship, without which my career at UC Santa Cruz would not have been possible. It all started in early 2004, during my leave from industry, when he suggested that I check out the storage systems research group meetings at UC Santa Cruz.
+
+Thank you all!
+
diff --git a/content/post/_index.md b/content/news/_index.md
similarity index 100%
rename from content/post/_index.md
rename to content/news/_index.md
diff --git a/content/news/first-post/index.md b/content/news/first-post/index.md
new file mode 100644
index 00000000000..cc331e38be3
--- /dev/null
+++ b/content/news/first-post/index.md
@@ -0,0 +1,104 @@
+---
+title: 'Academic: the website builder for Hugo'
+subtitle: 'Create a beautifully simple website in under 10 minutes :rocket:'
+summary: Create a beautifully simple website in under 10 minutes.
+authors:
+- admin
+tags:
+- Academic
+categories:
+- Demo
+date: "2016-04-20T00:00:00Z"
+lastmod: "2019-04-17T00:00:00Z"
+featured: false
+draft: true
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Placement options: 1 = Full column width, 2 = Out-set, 3 = Screen-width
+# Focal point options: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight
+image:
+ placement: 2
+ caption: 'Image credit: [**Unsplash**](https://unsplash.com/photos/CpkOjOcXdUY)'
+ focal_point: ""
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+---
+
+**Create a free website with Academic using Markdown, Jupyter, or RStudio. Choose a beautiful color theme and build anything with the Page Builder - over 40 _widgets_, _themes_, and _language packs_ included!**
+
+[Check out the latest **demo**](https://academic-demo.netlify.com/) of what you'll get in less than 10 minutes, or [view the **showcase**](https://sourcethemes.com/academic/#expo) of personal, project, and business sites.
+
+- 👉 [**Get Started**](#install)
+- 📚 [View the **documentation**](https://sourcethemes.com/academic/docs/)
+- 💬 [**Ask a question** on the forum](https://discourse.gohugo.io)
+- 👥 [Chat with the **community**](https://spectrum.chat/academic)
+- 🐦 Twitter: [@source_themes](https://twitter.com/source_themes) [@GeorgeCushen](https://twitter.com/GeorgeCushen) [#MadeWithAcademic](https://twitter.com/search?q=%23MadeWithAcademic&src=typd)
+- 💡 [Request a **feature** or report a **bug**](https://github.com/gcushen/hugo-academic/issues)
+- ⬆️ **Updating?** View the [Update Guide](https://sourcethemes.com/academic/docs/update/) and [Release Notes](https://sourcethemes.com/academic/updates/)
+- :heart: **Support development** of Academic:
+ - ☕️ [**Donate a coffee**](https://paypal.me/cushen)
+ - 💵 [Become a backer on **Patreon**](https://www.patreon.com/cushen)
+ - 🖼️ [Decorate your laptop or journal with an Academic **sticker**](https://www.redbubble.com/people/neutreno/works/34387919-academic)
+ - 👕 [Wear the **T-shirt**](https://academic.threadless.com/)
+ - :woman_technologist: [**Contribute**](https://sourcethemes.com/academic/docs/contribute/)
+
+{{< figure src="https://raw.githubusercontent.com/gcushen/hugo-academic/master/academic.png" title="Academic is mobile first with a responsive design to ensure that your site looks stunning on every device." >}}
+
+**Key features:**
+
+- **Page builder** - Create *anything* with [**widgets**](https://sourcethemes.com/academic/docs/page-builder/) and [**elements**](https://sourcethemes.com/academic/docs/writing-markdown-latex/)
+- **Edit any type of content** - Blog posts, publications, talks, slides, projects, and more!
+- **Create content** in [**Markdown**](https://sourcethemes.com/academic/docs/writing-markdown-latex/), [**Jupyter**](https://sourcethemes.com/academic/docs/jupyter/), or [**RStudio**](https://sourcethemes.com/academic/docs/install/#install-with-rstudio)
+- **Plugin System** - Fully customizable [**color** and **font themes**](https://sourcethemes.com/academic/themes/)
+- **Display Code and Math** - Code highlighting and [LaTeX math](https://en.wikibooks.org/wiki/LaTeX/Mathematics) supported
+- **Integrations** - [Google Analytics](https://analytics.google.com), [Disqus commenting](https://disqus.com), Maps, Contact Forms, and more!
+- **Beautiful Site** - Simple and refreshing one page design
+- **Industry-Leading SEO** - Help get your website found on search engines and social media
+- **Media Galleries** - Display your images and videos with captions in a customizable gallery
+- **Mobile Friendly** - Look amazing on every screen with a mobile friendly version of your site
+- **Multi-language** - 15+ language packs including English, 中文, and Português
+- **Multi-user** - Each author gets their own profile page
+- **Privacy Pack** - Assists with GDPR
+- **Stand Out** - Bring your site to life with animation, parallax backgrounds, and scroll effects
+- **One-Click Deployment** - No servers. No databases. Only files.
+
+## Themes
+
+Academic comes with **automatic day (light) and night (dark) mode** built-in. Alternatively, visitors can choose their preferred mode - click the sun/moon icon in the top right of the [Demo](https://academic-demo.netlify.com/) to see it in action! Day/night mode can also be disabled by the site admin in `params.toml`.
+
+[Choose a stunning **theme** and **font**](https://sourcethemes.com/academic/themes/) for your site. Themes are fully [customizable](https://sourcethemes.com/academic/docs/customization/#custom-theme).
+
+## Ecosystem
+
+* **[Academic Admin](https://github.com/sourcethemes/academic-admin):** An admin tool to import publications from BibTeX or import assets for an offline site
+* **[Academic Scripts](https://github.com/sourcethemes/academic-scripts):** Scripts to help migrate content to new versions of Academic
+
+## Install
+
+You can choose from one of the following four methods to install:
+
+* [**one-click install using your web browser (recommended)**](https://sourcethemes.com/academic/docs/install/#install-with-web-browser)
+* [install on your computer using **Git** with the Command Prompt/Terminal app](https://sourcethemes.com/academic/docs/install/#install-with-git)
+* [install on your computer by downloading the **ZIP files**](https://sourcethemes.com/academic/docs/install/#install-with-zip)
+* [install on your computer with **RStudio**](https://sourcethemes.com/academic/docs/install/#install-with-rstudio)
+
+Then [personalize and deploy your new site](https://sourcethemes.com/academic/docs/get-started/).
+
+## Updating
+
+[View the Update Guide](https://sourcethemes.com/academic/docs/update/).
+
+Feel free to *star* the project on [Github](https://github.com/gcushen/hugo-academic/) to help keep track of [updates](https://sourcethemes.com/academic/updates).
+
+## License
+
+Copyright 2016-present [George Cushen](https://georgecushen.com).
+
+Released under the [MIT](https://github.com/gcushen/hugo-academic/blob/master/LICENSE.md) license.
diff --git a/content/project/.DS_Store b/content/project/.DS_Store
new file mode 100644
index 00000000000..b577689ab2b
Binary files /dev/null and b/content/project/.DS_Store differ
diff --git a/content/project/big-weather-web/.DS_Store b/content/project/big-weather-web/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/big-weather-web/.DS_Store differ
diff --git a/content/project/big-weather-web/featured.png b/content/project/big-weather-web/featured.png
new file mode 100644
index 00000000000..5911fc73620
Binary files /dev/null and b/content/project/big-weather-web/featured.png differ
diff --git a/content/project/big-weather-web/featured1.png b/content/project/big-weather-web/featured1.png
new file mode 100644
index 00000000000..12e6f380648
Binary files /dev/null and b/content/project/big-weather-web/featured1.png differ
diff --git a/content/project/big-weather-web/index.md b/content/project/big-weather-web/index.md
new file mode 100644
index 00000000000..980b63f958f
--- /dev/null
+++ b/content/project/big-weather-web/index.md
@@ -0,0 +1,27 @@
+---
+title: "Big Weather Web"
+summary: "Making a community infrastructure for reproducible numerical weather prediction."
+tags:
+- reproducibility
+date: "2016-04-23T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: "http://bigweatherweb.org"
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
diff --git a/content/project/cross/.DS_Store b/content/project/cross/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/cross/.DS_Store differ
diff --git a/content/project/cross/featured.png b/content/project/cross/featured.png
new file mode 100644
index 00000000000..4ae1bdef780
Binary files /dev/null and b/content/project/cross/featured.png differ
diff --git a/content/project/cross/index.md b/content/project/cross/index.md
new file mode 100644
index 00000000000..7e949b1dcd7
--- /dev/null
+++ b/content/project/cross/index.md
@@ -0,0 +1,27 @@
+---
+title: "Center for Research in Open Source Software"
+summary: "Making a university infrastructure for bridging the gap between student research and open source projects."
+tags:
+- cross
+date: "2016-04-23T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: "https://cross.ucsc.edu"
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
diff --git a/content/project/declstore/featured.png b/content/project/declstore/featured.png
new file mode 100644
index 00000000000..061f09facde
Binary files /dev/null and b/content/project/declstore/featured.png differ
diff --git a/content/project/declstore/index.md b/content/project/declstore/index.md
new file mode 100644
index 00000000000..8f4c9ff0ee1
--- /dev/null
+++ b/content/project/declstore/index.md
@@ -0,0 +1,36 @@
+---
+title: "Declarative Programmable Storage"
+summary: "Making programmable storage manageable."
+tags:
+- programmable
+- declarative
+date: "2016-04-27T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Funding:** [NSF CNS-1764102](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1764102)
+**Vision:** [USENIX HotStorage '17]({{< ref "/publication/watkins-hotstorage-17" >}})
+
+Large-scale storage systems are caught between a rock and a hard place. Below them, the hardware and software stacks are rapidly evolving, as new media such as solid-state drives and non-volatile memory disrupt traditional performance assumptions. It is more important than ever to future-proof applications and storage interfaces against dynamism and heterogeneity in the storage infrastructure. Meanwhile, above them, emerging classes of data-intensive applications continually demand new storage abstractions beyond the narrow waist of the POSIX IO API. Recent efforts have shown the promise of programmable storage, the principled reuse of existing subsystems exposed by the distributed infrastructure to enable new storage abstractions via composition. Unfortunately, in order to enable rapid application evolution, programmable storage forfeits protection from infrastructure evolution. This is because the composition of subsystems is a low-level task that couples (and hence obscures) a variety of orthogonal concerns, including functional correctness and performance. Building an interface by wiring together a collection of components typically requires thousands of lines of carefully written C++ code, an effort that must be repeated whenever device or subsystem characteristics change.
+
+
+In this project, we explore a declarative approach to programmable storage. We observe that, much as was the case in the early days of relational databases, the rate of change of the application logic defining storage interfaces is dwarfed by the rate of change of the hardware and software infrastructure. We pose choosing an implementation that is consistent with the declarative functional specification as a search problem over the space of semantically-identical implementations, which can be reevaluated whenever device characteristics, software or workloads change.
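+
+As a toy illustration of that search framing (not code from this project), the following Python sketch scores two semantically identical implementations of one declarative interface with a simple cost model and picks the cheapest; the search can be re-run whenever device or workload parameters change. All names and numbers here are made up:
+
+```python
+from dataclasses import dataclass
+
+DATASET_MB = 10_000  # size of the data behind the declarative interface
+
+@dataclass
+class Env:
+    """Deployment parameters that keep changing under the interface."""
+    seq_bw_mbs: float  # sequential bandwidth, MB/s
+    rand_iops: float   # random reads per second
+    lookups: int       # point lookups the workload needs
+
+# Two semantically identical implementations of the same functional spec.
+def cost_full_scan(env: Env) -> float:
+    return DATASET_MB / env.seq_bw_mbs
+
+def cost_index_lookups(env: Env) -> float:
+    return env.lookups / env.rand_iops
+
+IMPLS = {"full scan": cost_full_scan, "index lookups": cost_index_lookups}
+
+def choose(env: Env) -> str:
+    """Search the implementation space for the cheapest candidate."""
+    return min(IMPLS, key=lambda name: IMPLS[name](env))
+
+print(choose(Env(seq_bw_mbs=200, rand_iops=150, lookups=100_000)))        # disk-era pick
+print(choose(Env(seq_bw_mbs=3_000, rand_iops=500_000, lookups=100_000)))  # NVMe-era pick
+```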
\ No newline at end of file
diff --git a/content/project/eusocial-storage/.DS_Store b/content/project/eusocial-storage/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/eusocial-storage/.DS_Store differ
diff --git a/content/project/eusocial-storage/featured.png b/content/project/eusocial-storage/featured.png
new file mode 100644
index 00000000000..0d82b992cf1
Binary files /dev/null and b/content/project/eusocial-storage/featured.png differ
diff --git a/content/project/eusocial-storage/index.md b/content/project/eusocial-storage/index.md
new file mode 100644
index 00000000000..59f560f2240
--- /dev/null
+++ b/content/project/eusocial-storage/index.md
@@ -0,0 +1,34 @@
+---
+title: "Eusocial Storage Devices"
+summary: "Making storage devices act collectively."
+tags:
+- programmable
+- eusocial
+- embedded
+date: "2016-04-27T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Funding:** [CROSS](https://cross.ucsc.edu/)
+**Overview:** [USENIX ;login: Summer 2018](https://drive.google.com/file/d/1WzEZr0Xzn0c7ke3Mpt8mqa1J_tEU58-D/view?usp=sharing)
+
+As storage devices get faster, data management tasks rob the host of CPU cycles and DDR bandwidth. In this project, we examine a new interface to storage devices that can leverage existing and new CPU and DRAM resources to take over data management tasks like availability, recovery, and migrations *that span many storage devices*. This new interface provides a roadmap for device-to-device interactions and more powerful storage devices capable of providing in-store compute services that can dramatically improve performance. We call such storage devices “eusocial” because we are inspired by eusocial insects like ants, termites, and bees, which as individuals are primitive but collectively accomplish amazing things.
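+
+A toy Python sketch of the collective idea (entirely hypothetical, not this project's interface): devices hold replicas and rebuild a lost object by talking to peers, without involving the host:
+
+```python
+class Device:
+    """Toy eusocial device: holds object replicas and talks to peers."""
+    def __init__(self, name):
+        self.name = name
+        self.objects = {}  # object id -> bytes
+        self.peers = []    # other Devices, reachable device-to-device
+
+    def store(self, oid, data):
+        self.objects[oid] = data
+
+    def fail(self):
+        self.objects.clear()
+
+    def rebuild(self, oid):
+        # Recovery runs between devices, not through host CPU/DDR.
+        for peer in self.peers:
+            if oid in peer.objects:
+                self.objects[oid] = peer.objects[oid]
+                return True
+        return False
+
+a, b, c = Device("a"), Device("b"), Device("c")
+for d in (a, b, c):
+    d.peers = [p for p in (a, b, c) if p is not d]
+
+a.store("obj1", b"payload")
+b.store("obj1", b"payload")  # second replica
+a.fail()
+print(a.rebuild("obj1"))     # True: replica pulled from a peer device
+```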
\ No newline at end of file
diff --git a/content/project/metadata-rich/.DS_Store b/content/project/metadata-rich/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/metadata-rich/.DS_Store differ
diff --git a/content/project/metadata-rich/featured.png b/content/project/metadata-rich/featured.png
new file mode 100644
index 00000000000..36f743254c6
Binary files /dev/null and b/content/project/metadata-rich/featured.png differ
diff --git a/content/project/metadata-rich/index.md b/content/project/metadata-rich/index.md
new file mode 100644
index 00000000000..8ae981b10d1
--- /dev/null
+++ b/content/project/metadata-rich/index.md
@@ -0,0 +1,36 @@
+---
+title: "Metadata-Rich File Systems"
+summary: "Making file systems queryable."
+tags:
+- metadata
+date: "2016-04-20T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+**Student:** Sasha Ames
+**Collaborator:** Maya Gokhale (LLNL)
+**Funding:** PDSI, LLNL
+
+This project is an LLNL/UCSC collaboration: the goal is to design a scalable metadata-rich file system with database-like data management services. With such a file system, scientists will be able to perform time-critical analysis over continually evolving, very large data sets.
+
+In the first phase we designed and implemented QUASAR, a path-based query language that uses the POSIX IO data model extended by relational links. We conducted a couple of data mining case studies in which we compared a baseline architecture, consisting of a database and a file system, with our MRFS prototype. The QUASAR query language provides much easier access to large data sets than POSIX IO, and MRFS' querying performance is significantly better than the baseline system due to QUASAR's hierarchical scoping.
+
+Challenges remain and we are in the process of addressing them: we are working on a scalable physical data model for QUASAR's logical data model, and we are designing a rich-metadata client cache to address small-update overheads and metadata coherence.
diff --git a/content/project/ospo/featured.png b/content/project/ospo/featured.png
new file mode 100644
index 00000000000..b6eb6bf0992
Binary files /dev/null and b/content/project/ospo/featured.png differ
diff --git a/content/project/ospo/index.md b/content/project/ospo/index.md
new file mode 100644
index 00000000000..4fd411faf73
--- /dev/null
+++ b/content/project/ospo/index.md
@@ -0,0 +1,27 @@
+---
+title: "Open Source Program Office - UC Santa Cruz"
+summary: "Making a university infrastructure for amplifying research impact."
+tags:
+- ospo
+date: "2022-05-10T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: "https://ospo.ucsc.edu"
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
diff --git a/content/project/packaging/featured.png b/content/project/packaging/featured.png
new file mode 100644
index 00000000000..ba3f4fbb36b
Binary files /dev/null and b/content/project/packaging/featured.png differ
diff --git a/content/project/packaging/index.md b/content/project/packaging/index.md
new file mode 100644
index 00000000000..ba6e64bcbbb
--- /dev/null
+++ b/content/project/packaging/index.md
@@ -0,0 +1,34 @@
+---
+title: "Reproducible Dependency Management"
+summary: "Making builds reproducible."
+tags:
+- reproducibility
+date: "2022-05-09T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+
+**Overview:** [SC'22 paper]({{< ref "/publication/zakaria-sc-22" >}})
+**Important Links:** [Farid Zakaria's blog](https://fzakaria.com/), [LLNL's BUILD project](https://computing.llnl.gov/projects/build)
+**Outside collaborators:** Todd Gamblin (LLNL), Tom Scogland (LLNL)
+
+Software stacks have become complex, with the dependencies of some applications numbering in the hundreds. Packaging, distributing, and administering software stacks of that scale is a complex undertaking anywhere. With increasingly heterogeneous architectures, today's systems deal with esoteric compilers, hardware, and a panoply of uncommon combinations. We explore the mechanisms available for packaging software to find its own dependencies in the context of a taxonomy of software distribution, and discuss their benefits and pitfalls. A goal of this project is to make builds reproducible *and* portable *and* performance-optimized, removing a key barrier to entry for the next generation of computer experimentalists in classrooms and research labs.
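+
+As a small, hedged illustration of one such mechanism (ELF dynamic linking on Linux records a binary's direct dependencies as `DT_NEEDED` entries), the sketch below shells out to the standard `readelf` tool and lists those entries; it assumes a Linux host with binutils installed, and readelf's exact output format can vary slightly across versions:
+
+```python
+import re
+import subprocess
+import sys
+
+def needed_libraries(path: str) -> list[str]:
+    """Return the DT_NEEDED entries (direct shared-library deps) of an ELF file."""
+    out = subprocess.run(
+        ["readelf", "-d", path], capture_output=True, text=True, check=True
+    ).stdout
+    # readelf prints lines like: "... (NEEDED)  Shared library: [libc.so.6]"
+    return re.findall(r"\(NEEDED\).*\[(.+?)\]", out)
+
+if __name__ == "__main__":
+    target = sys.argv[1] if len(sys.argv) > 1 else "/bin/ls"
+    for lib in needed_libraries(target):
+        print(lib)
+```
+
+How those entries are then resolved at load time (RPATH/RUNPATH, loader caches, environment variables) is exactly the kind of distribution mechanism the taxonomy above examines.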
\ No newline at end of file
diff --git a/content/project/practical-reproducibility/.DS_Store b/content/project/practical-reproducibility/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/practical-reproducibility/.DS_Store differ
diff --git a/content/project/practical-reproducibility/featured.png b/content/project/practical-reproducibility/featured.png
new file mode 100644
index 00000000000..921ff1f3792
Binary files /dev/null and b/content/project/practical-reproducibility/featured.png differ
diff --git a/content/project/practical-reproducibility/index.md b/content/project/practical-reproducibility/index.md
new file mode 100644
index 00000000000..e30d123f4df
--- /dev/null
+++ b/content/project/practical-reproducibility/index.md
@@ -0,0 +1,38 @@
+---
+title: "Reproducible Evaluation of Systems"
+summary: "Making delivery of systems research more efficient."
+tags:
+- reproducibility
+date: "2016-04-27T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Website:** [getpopper.io](https://getpopper.io)
+**Funding:** [NSF OAC-1450488](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1450488), [NSF OAC-1836650](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1836650), [CROSS](https://cross.ucsc.edu/)
+**Overview:** [USENIX ;login: Winter 2016](https://drive.google.com/file/d/0B5rZ7hI6vXv3bHlxdEpIMkphS0U/view?usp=sharing)
+**Workshops:**
+
+- [1st International Workshop on Practical Reproducible Evaluation of Computer Systems](https://p-recs.github.io/2018/) (P-RECS 2018) held in conjunction with ACM HPDC 2018.
+- [2nd International Workshop on Practical Reproducible Evaluation of Computer Systems](https://p-recs.github.io/2019/) (P-RECS 2019) held in conjunction with ACM HPDC 2019.
+- [3rd International Workshop on Practical Reproducible Evaluation of Computer Systems](https://p-recs.github.io/2020/) (P-RECS 2020) held in conjunction with ACM HPDC 2020.
+
+Independently validating experimental results in the field of computer systems research is a challenging task. Recreating an environment that resembles the one where an experiment was originally executed is a time-consuming endeavor. Popper is a convention (or protocol) for conducting experiments following a DevOps approach that allows researchers to make all associated artifacts publicly available with the goal of maximizing automation in the re-execution of an experiment and validation of its results.
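+
+To convey the flavor of the DevOps approach, here is a toy, Popper-inspired runner in Python (not Popper's actual workflow format or CLI): each step pins a container image, so the whole experiment re-executes the same way on any machine with Docker installed:
+
+```python
+import subprocess
+
+# A toy workflow: each step declares a pinned container image and a
+# command, so re-execution is automated and environment-independent.
+WORKFLOW = [
+    {"id": "generate", "image": "python:3.11", "cmd": ["python", "-c", "print(42)"]},
+    {"id": "analyze",  "image": "python:3.11", "cmd": ["python", "-c", "print('ok')"]},
+]
+
+for step in WORKFLOW:
+    print(f"==> running step {step['id']}")
+    subprocess.run(
+        ["docker", "run", "--rm", step["image"], *step["cmd"]],
+        check=True,  # fail fast, as a validation run should
+    )
+```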
diff --git a/content/project/programmable-storage/.DS_Store b/content/project/programmable-storage/.DS_Store
new file mode 100644
index 00000000000..5008ddfcf53
Binary files /dev/null and b/content/project/programmable-storage/.DS_Store differ
diff --git a/content/project/programmable-storage/featured.png b/content/project/programmable-storage/featured.png
new file mode 100644
index 00000000000..7d820a827b7
Binary files /dev/null and b/content/project/programmable-storage/featured.png differ
diff --git a/content/project/programmable-storage/index.md b/content/project/programmable-storage/index.md
new file mode 100644
index 00000000000..13bf8029edb
--- /dev/null
+++ b/content/project/programmable-storage/index.md
@@ -0,0 +1,34 @@
+---
+title: "Programmable Storage Systems"
+summary: "Making storage abstractions scale over heterogeneous storage media."
+tags:
+- programmable
+date: "2016-04-27T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Website:** [programmability.us](http://programmability.us)
+**Press Coverage:** [The Next Platform (8/1/17)](https://www.nextplatform.com/2017/08/01/fresh-thinking-programmable-storage/amp/)
+**Funding:** [CROSS](https://cross.ucsc.edu/), [DOE SSIO SIRIUS](https://extremescaleresearch.labworks.org/projects/sirius), [NSF CNS-1705021](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1705021&HistoricalAwards=false)
+**Workshop:** [1st Programmable File Systems Workshop](http://www.cs.ucsc.edu/~carlosm/PFSW/Home.html) (PFSW) held in conjunction with ACM HPDC’14
+
+A programmable storage system exposes internal subsystem abstractions as “interfaces” to enable the creation of higher-level services via composition. Malacology is a programmable storage system that enables the programmability of internal abstractions in Ceph. Using Malacology, we built the Mantle and ZLog services.
\ No newline at end of file
diff --git a/content/project/research-statement-2009/featured.png b/content/project/research-statement-2009/featured.png
new file mode 100644
index 00000000000..5e8b358ac5b
Binary files /dev/null and b/content/project/research-statement-2009/featured.png differ
diff --git a/content/project/research-statement-2009/index.md b/content/project/research-statement-2009/index.md
new file mode 100644
index 00000000000..6e351c1c7ba
--- /dev/null
+++ b/content/project/research-statement-2009/index.md
@@ -0,0 +1,26 @@
+---
+title: "Research Statement (2009)"
+summary: "Making Associate Adjunct in 2010."
+tags: []
+date: "2016-04-15T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: "https://users.soe.ucsc.edu/~carlosm/UCSC/Research_Statement.html"
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
diff --git a/content/project/skyhook/featured.png b/content/project/skyhook/featured.png
new file mode 100644
index 00000000000..c76016e894b
Binary files /dev/null and b/content/project/skyhook/featured.png differ
diff --git a/content/project/skyhook/index.md b/content/project/skyhook/index.md
new file mode 100644
index 00000000000..844dd255156
--- /dev/null
+++ b/content/project/skyhook/index.md
@@ -0,0 +1,51 @@
+---
+title: "SkyhookDM"
+summary: "Making storage & network layers manage data."
+tags:
+- programmable
+- eusocial
+- skyhook
+- smartnics
+- arrow
+date: "2022-05-09T00:00:00Z"
+last_update: 2023-08-03
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Websites:** [Skyhook Data Management](https://skyhookdm.github.io/), [IRIS-HEP project](https://iris-hep.org/projects/skyhookdm.html)
+**Funding:** [NSF TI-2229773](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2229773), DOE ASCR DE-NA0003525 (FWP 20-023266): UCSC subcontractor of Sandia National Labs, [NSF OAC-1836650](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1836650), [NSF CNS-1764102](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1764102), [NSF CNS-1705021](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1705021), and [CROSS](https://cross.ucsc.edu/)
+**Overviews:** [CCGrid'22 paper]({{< ref "/publication/chakraborty-ccgrid-22" >}}), [COMPSYS'23 paper]({{< ref "/publication/ulmer-compsys-23" >}})
+**Important Links:** [GitHub repository](https://github.com/skyhookdm/), [Ceph plugin repository](https://github.com/apache/arrow/tree/master/cpp/src/skyhook), getting started [instructions](https://skyhookdm-arrow.readthedocs.io/en/latest/getting_started.html) and [notebook](https://github.com/uccross/arrow/blob/rados-dataset-dev/cpp/src/arrow/adapters/arrow-rados-cls/docs/demo.ipynb), code walkthrough [video](https://www.youtube.com/watch?v=XfJsnadp18c).
+
+The key advantage of the cloud is its elasticity. This is implemented by systems that can expand and shrink resources quickly and by disaggregating services, including compute, networking, and storage. Elasticity is also valuable for on-premise datacenters, where disaggregation allows compute and storage to scale independently. This disaggregation, however, places greater demand on expensive top-of-rack networking resources, since compute and storage nodes end up in different racks and even rows as the installation grows. More network traffic also requires more CPU cycles to be dedicated to sending and receiving data. Therefore, disaggregation, somewhat paradoxically, amplifies the benefit of moving some compute -- the compute that involves data management -- into storage & network layers because data management filtering operations can significantly reduce data movement.
+
+Combining data management with storage and networking also creates the opportunity for new services that avoid dataset copies and can thereby save significant storage space. Data management-enabled storage systems can provide views that combine parts of multiple datasets: columns from one table can be combined with columns from a different table without creating copies. For this to work, these storage systems need to store sufficient metadata and naming conventions about datasets. This makes them a natural place for maintaining this metadata and serving it to other tools in convenient formats.
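+
+As an illustration of such copy-free views (plain PyArrow, not SkyhookDM's API), the sketch below builds a view that combines columns from two tables without copying the underlying buffers:
+
+```python
+import pyarrow as pa
+
+orders = pa.table({"order_id": [1, 2, 3], "amount": [9.5, 3.2, 7.8]})
+customers = pa.table({"order_id": [1, 2, 3], "region": ["EU", "US", "US"]})
+
+# Column references share buffers with the source tables; no data is copied.
+view = pa.table({
+    "order_id": orders.column("order_id"),
+    "amount": orders.column("amount"),
+    "region": customers.column("region"),
+})
+print(view)
+```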
+
+## The Apache Arrow Ceph Plugin
+
+Skyhook Data Management consists of multiple subprojects at different stages of maturity, spanning storage and networking. The most mature subproject is the Apache Arrow Ceph plugin, an extension of the [Ceph open source distributed storage system](https://ceph.io/) for the scalable storage of tables and for offloading common data management operations on them, including selection, projection, aggregation, and indexing, as well as user-defined functions (see the [Apache Arrow blog post](https://arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/) and [GitHub repository](https://github.com/apache/arrow/tree/master/cpp/src/skyhook)). The goal of the Apache Arrow Ceph plugin is to transparently scale out data management operations across many storage servers, leveraging the scale-out and availability properties of Ceph while significantly reducing the CPU cycles and interconnect bandwidth spent on unnecessary data transfers. The SkyhookDM architecture is also designed to transparently optimize for future storage devices of increasing heterogeneity and specialization. All data movement from the Ceph OSDs to the client uses the [Apache Arrow](https://arrow.apache.org/) format.
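+
+The sketch below shows the kind of scan the plugin offloads, expressed with the standard PyArrow dataset API against a hypothetical local Parquet directory; with the Skyhook plugin, the same projection and selection are evaluated inside the Ceph storage servers rather than on the client.
+
+```python
+import pyarrow.dataset as ds
+
+# "warehouse/" is a stand-in for a Ceph-backed table dataset.
+dataset = ds.dataset("warehouse/", format="parquet")
+table = dataset.to_table(
+    columns=["order_id", "amount"],   # projection pushdown
+    filter=ds.field("amount") > 5.0,  # selection pushdown
+)
+print(table.num_rows)
+```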
+
+## For more information ...
+
+See the [Skyhook Data Management website](https://skyhookdm.github.io/) and [GitHub repositories](https://github.com/skyhookdm/).
+
+SkyhookDM is currently an incubator project at the [Center for Research in Open Source Software](https://cross.ucsc.edu) at the [University of California, Santa Cruz](https://ucsc.edu).
diff --git a/content/project/smartnic/featured.png b/content/project/smartnic/featured.png
new file mode 100644
index 00000000000..5454451f046
Binary files /dev/null and b/content/project/smartnic/featured.png differ
diff --git a/content/project/smartnic/index.md b/content/project/smartnic/index.md
new file mode 100644
index 00000000000..fc36820ce92
--- /dev/null
+++ b/content/project/smartnic/index.md
@@ -0,0 +1,39 @@
+---
+title: "Offloading Data Management Services to Smart NICs"
+summary: "Making Smart NICs manage data."
+tags:
+- programmable
+- eusocial
+- smartnics
+- skyhook
+date: "2021-01-20T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Funding:** DOE ASCR DE-NA0003525 (FWP 20-023266): UCSC subcontractor of Sandia National Labs.
+**Press Coverage:** [The Next Platform (5/24/21)](https://www.nextplatform.com/2021/05/24/testing-the-limits-of-the-bluefield-2-smartnic/)
+
+The responsibilities of the I/O subsystem in high-performance computing (HPC) platforms have grown significantly over the last decade. In addition to delivering high-bandwidth, high-volume storage for results and checkpoints, the platform is expected to provide a variety of data management and storage services (DMSSes) that have become an essential part of users’ workflows. These services include lightweight key/value stores for aggregating state, in-memory object stores for data handoffs between workflow applications, I/O libraries that manage shared state for complex structured data, and programmable storage frameworks that generate live annotations of simulations.
+
+The importance of these DMSSes has driven the scalable I/O community to re-evaluate how services are architected and deployed in modern HPC platforms. Rather than build fixed, system-level services (e.g., burst buffers), many researchers embrace flexible, user-level services that are co-scheduled with simulations. Current research advocates a “composable service” model where a small number of communication components are used to create services that are highly customized to a workflow’s requirements. This approach provides users with the freedom to specify when and where their DMSSes run on a platform. A valid criticism of this work is that current architectures lack an optimal location for hosting these services. Researchers have explored hosting services in the simulation’s compute nodes, supplemental compute nodes, the storage system, and “bump-in-the-wire” network hardware. These approaches either steal resources from the simulation, increase network congestion, or are impractical for security or cost reasons.
+
+In this project we advocate an alternative approach: embed DMSSes in Smart Network Interface Cards (Smart NICs) located in the compute nodes. Emerging Smart NICs such as the Mellanox BlueField card supplement a traditional NIC with processing and memory resources that are user programmable. Transitioning a composable service library to function on these Smart NICs places services in close proximity to simulations without consuming host resources. The proposed work seeks to resolve fundamental challenges that arise from this architectural change and evaluate how well DMSSes perform in an environment that mixes compute nodes and Smart NICs.
diff --git a/content/project/storage-simulation/featured.png b/content/project/storage-simulation/featured.png
new file mode 100644
index 00000000000..b3a9786b402
Binary files /dev/null and b/content/project/storage-simulation/featured.png differ
diff --git a/content/project/storage-simulation/index.md b/content/project/storage-simulation/index.md
new file mode 100644
index 00000000000..f3387020919
--- /dev/null
+++ b/content/project/storage-simulation/index.md
@@ -0,0 +1,39 @@
+---
+title: "Scalable Storage System Simulation"
+summary: "Making storage system models."
+tags:
+- simulation
+date: "2016-04-21T00:00:00Z"
+
+# Optional external URL for project (replaces project detail page).
+external_link: ""
+
+image:
+ caption: ""
+ focal_point: Smart
+
+links: []
+url_code: ""
+url_pdf: ""
+url_slides: ""
+url_video: ""
+
+# Slides (optional).
+# Associate this project with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides = "example-slides"` references `content/slides/example-slides.md`.
+# Otherwise, set `slides = ""`.
+slides: []
+---
+
+**Students:** Adam Crume, Esteban Molina-Estolano
+**Collaborators:** Matthew Curry (SNL), Thomas Kroeger (SNL), Lee Ward (SNL), Rob Ross (ANL), Christopher Carothers (RPI), John Bent (LANL), Gary Grider (LANL), James Nunez (LANL), Scott Brandt (UCSC), Kleoni Ioannidou (UCSC)
+**Funding:** DOE, PDSI, ISSDM, GAANN
+
+The goal of this project, a LANL/UCSC collaboration, is to create a simulator for parallel file systems; it has already generated strong interest at labs and universities. Such a simulator will
+
+1. enable file system designers and researchers to try out innovative data placement strategies and other novel subsystems at scale,
+2. facilitate file system deployment by providing a low-cost platform for "what-if" workload and file system tuning scenarios,
+3. empower scientists to quickly tune existing file systems for specific workloads, and
+4. aid instructors by providing a platform for classroom experiments.
+
+In the first phase we started by building the simulator on a very simple model of parallel file systems and a set of placement strategies from commonly used systems (so far: PVFS, PanFS, and Ceph). We are in the process of validating the simulator by replaying traces collected by PDSI, LLNL, and industry partners. The validation involves a careful and disciplined process of adding features to, and removing them from, the simulator's file system model to arrive at the minimal set of features necessary to reproduce real systems' behavior.
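+
+As a toy illustration of the core idea (hypothetical code, not the project's simulator), the sketch below replays an I/O trace against two pluggable placement strategies and compares the resulting per-server loads:
+
+```python
+import hashlib
+
+NUM_SERVERS = 8
+
+def hash_placement(obj_id: str) -> int:
+    # Ceph-style pseudo-random placement by hashing the object name.
+    return int(hashlib.md5(obj_id.encode()).hexdigest(), 16) % NUM_SERVERS
+
+def round_robin_placement(obj_id: str) -> int:
+    # PVFS-style striping: the stripe index determines the server.
+    return int(obj_id.split(".")[-1]) % NUM_SERVERS
+
+def replay(trace, place):
+    load = [0] * NUM_SERVERS
+    for obj_id, nbytes in trace:
+        load[place(obj_id)] += nbytes
+    return load
+
+trace = [(f"file0.{i}", 4096) for i in range(64)]  # toy trace: 64 stripe units
+print("hash       :", replay(trace, hash_placement))
+print("round-robin:", replay(trace, round_robin_placement))
+```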
diff --git a/content/publication/.DS_Store b/content/publication/.DS_Store
new file mode 100644
index 00000000000..6b74ea6ec58
Binary files /dev/null and b/content/publication/.DS_Store differ
diff --git a/content/publication/.gitignore b/content/publication/.gitignore
new file mode 100644
index 00000000000..a1363379944
--- /dev/null
+++ b/content/publication/.gitignore
@@ -0,0 +1 @@
+*.pdf
diff --git a/content/publication/.gitignore~ b/content/publication/.gitignore~
new file mode 100644
index 00000000000..c9c16aac70a
--- /dev/null
+++ b/content/publication/.gitignore~
@@ -0,0 +1 @@
+.pdf
diff --git a/content/publication/ames-fast-10-poster/cite.bib b/content/publication/ames-fast-10-poster/cite.bib
new file mode 100644
index 00000000000..945ba6a2838
--- /dev/null
+++ b/content/publication/ames-fast-10-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{ames:fast10poster,
+ address = {San Jose, CA},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXQS9hbWVzLWZhc3QxMHBvc3Rlci5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VYW1lcy1mYXN0MTBwb3N0ZXIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUEAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLWZhc3QxMHBvc3Rlci5wZGYAAA4ALAAVAGEAbQBlAHMALQBmAGEAcwB0ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvQS9hbWVzLWZhc3QxMHBvc3Rlci5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {Poster Session at the Conference on File and Storage Technology (FAST 2010)},
+ date-added = {2019-12-26 20:23:07 -0800},
+ date-modified = {2019-12-29 16:32:23 -0800},
+ keywords = {shortpapers, filesystems, linking, metadata},
+ month = {February 24-27},
+ title = {Design and Implementation of a Metadata-Rich File System},
+ year = {2010}
+}
+
diff --git a/content/publication/ames-fast-10-poster/index.md b/content/publication/ames-fast-10-poster/index.md
new file mode 100644
index 00000000000..36c72be5ce8
--- /dev/null
+++ b/content/publication/ames-fast-10-poster/index.md
@@ -0,0 +1,14 @@
+---
+title: "Design and Implementation of a Metadata-Rich File System"
+date: 2010-02-01
+publishDate: 2020-01-05T06:43:50.380037Z
+authors: ["Sasha Ames", "Maya B. Gokhale", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at the Conference on File and Storage Technology (FAST 2010)*"
+tags: ["shortpapers", "filesystems", "linking", "metadata"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-mss-05/cite.bib b/content/publication/ames-mss-05/cite.bib
new file mode 100644
index 00000000000..c554c88446e
--- /dev/null
+++ b/content/publication/ames-mss-05/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{ames:mss05,
+ abstract = {Traditional file systems provide a weak and inadequate structure for meaningful representations of file interrelationships and other context-providing metadata. Existing designs, which store additional file-oriented metadata either in a database, on disk, or both are limited by the technologies upon which they depend. Moreover, they do not provide for user-defined relationships among files. To address these issues, we created the Linking File System (LiFS), a file system design in which files may have both arbitrary user- or application-specified attributes, and attributed links between files. In order to assure performance when accessing links and attributes, the system is designed to store metadata in non-volatile memory. This paper discusses several use cases that take advantage of this approach and describes the user-space prototype we developed to test the concepts presented.},
+ address = {Monterey, CA},
+ author = {Alexander Ames and Nikhil Bobb and Scott A. Brandt and Adam Hiatt and Carlos Maltzahn and Ethan L. Miller and Alisa Neeman and Deepa Tuteja},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQQS9hbWVzLW1zczA1LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5hbWVzLW1zczA1LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQQAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpBOmFtZXMtbXNzMDUucGRmAA4AHgAOAGEAbQBlAHMALQBtAHMAcwAwADUALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ booktitle = {MSST '05},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:21:32 -0700},
+ keywords = {papers, ssrc, metadata, filesystems, linking},
+ local-url = {/Users/carlosmalt/Documents/Papers/ames-mss05.pdf},
+ month = {April},
+ title = {Richer File System Metadata Using Links and Attributes},
+ year = {2005}
+}
+
diff --git a/content/publication/ames-mss-05/index.md b/content/publication/ames-mss-05/index.md
new file mode 100644
index 00000000000..e889e164083
--- /dev/null
+++ b/content/publication/ames-mss-05/index.md
@@ -0,0 +1,12 @@
+---
+title: "Richer File System Metadata Using Links and Attributes"
+date: 2005-04-01
+publishDate: 2020-01-05T13:33:06.017729Z
+authors: ["Alexander Ames", "Nikhil Bobb", "Scott A. Brandt", "Adam Hiatt", "Carlos Maltzahn", "Ethan L. Miller", "Alisa Neeman", "Deepa Tuteja"]
+publication_types: ["1"]
+abstract: "Traditional file systems provide a weak and inadequate structure for meaningful representations of file interrelationships and other context-providing metadata. Existing designs, which store additional file-oriented metadata either in a database, on disk, or both are limited by the technologies upon which they depend. Moreover, they do not provide for user-defined relationships among files. To address these issues, we created the Linking File System (LiFS), a file system design in which files may have both arbitrary user- or application-specified attributes, and attributed links between files. In order to assure performance when accessing links and attributes, the system is designed to store metadata in non-volatile memory. This paper discusses several use cases that take advantage of this approach and describes the user-space prototype we developed to test the concepts presented."
+featured: false
+publication: "*MSST '05*"
+tags: ["papers", "ssrc", "metadata", "filesystems", "linking"]
+---
+
diff --git a/content/publication/ames-mss-06/cite.bib b/content/publication/ames-mss-06/cite.bib
new file mode 100644
index 00000000000..a68902c063b
--- /dev/null
+++ b/content/publication/ames-mss-06/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{ames:mss06,
+ abstract = {As the number and variety of files stored and accessed by a typical user has dramatically increased, existing file system structures have begun to fail as a mechanism for managing all of the information contained in those files. Many applications---email clients, multimedia management applications, and desktop search engines are examples--- have been forced to develop their own richer metadata infrastructures. While effective, these solutions are generally non-standard, non-portable, non-sharable across applications, users or platforms, proprietary, and potentially inefficient. In the interest of providing a rich, efficient, shared file system metadata infrastructure, we have developed the Linking File System (LiFS). Taking advantage of non-volatile storage class memories, LiFS supports a wide variety of user and application metadata needs while efficiently supporting traditional file system operations.},
+ address = {College Park, MD},
+ author = {Sasha Ames and Nikhil Bobb and Kevin M. Greenan and Owen S. Hofmann and Mark W. Storer and Carlos Maltzahn and Ethan L. Miller and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQQS9hbWVzLW1zczA2LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5hbWVzLW1zczA2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQQAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpBOmFtZXMtbXNzMDYucGRmAA4AHgAOAGEAbQBlAHMALQBtAHMAcwAwADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ booktitle = {MSST '06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:15:24 -0700},
+ keywords = {papers, linking, systems, storage, metadata, storagemedium, related:quasar, filesystems},
+ local-url = {/Users/carlosmalt/Documents/Papers/ames-mss06.pdf},
+ month = {May},
+ organization = {IEEE},
+ title = {LiFS: An Attribute-Rich File System for Storage Class Memories},
+ year = {2006}
+}
+
diff --git a/content/publication/ames-mss-06/index.md b/content/publication/ames-mss-06/index.md
new file mode 100644
index 00000000000..96dc0bd16dd
--- /dev/null
+++ b/content/publication/ames-mss-06/index.md
@@ -0,0 +1,14 @@
+---
+title: "LiFS: An Attribute-Rich File System for Storage Class Memories"
+date: 2006-05-01
+publishDate: 2020-01-05T13:33:06.010088Z
+authors: ["Sasha Ames", "Nikhil Bobb", "Kevin M. Greenan", "Owen S. Hofmann", "Mark W. Storer", "Carlos Maltzahn", "Ethan L. Miller", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: "As the number and variety of files stored and accessed by a typical user has dramatically increased, existing file system structures have begun to fail as a mechanism for managing all of the information contained in those files. Many applications---email clients, multimedia management applications, and desktop search engines are examples--- have been forced to develop their own richer metadata infrastructures. While effective, these solutions are generally non-standard, non-portable, non-sharable across applications, users or platforms, proprietary, and potentially inefficient. In the interest of providing a rich, efficient, shared file system metadata infrastructure, we have developed the Linking File System (LiFS). Taking advantage of non-volatile storage class memories, LiFS supports a wide variety of user and application metadata needs while efficiently supporting traditional file system operations."
+featured: false
+publication: "*MSST '06*"
+tags: ["papers", "linking", "systems", "storage", "metadata", "storagemedium", "related:quasar", "filesystems"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-nas-11/cite.bib b/content/publication/ames-nas-11/cite.bib
new file mode 100644
index 00000000000..6a8538d3fb2
--- /dev/null
+++ b/content/publication/ames-nas-11/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{ames:nas11,
+ address = {Dalian, China},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ booktitle = {NAS 2011},
+ date-added = {2011-05-26 23:15:19 -0700},
+ date-modified = {2011-05-26 23:17:11 -0700},
+ keywords = {papers, metadata, graphs, linking, filesystems},
+ month = {July 28-30},
+ title = {QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language},
+ year = {2011}
+}
+
diff --git a/content/publication/ames-nas-11/index.md b/content/publication/ames-nas-11/index.md
new file mode 100644
index 00000000000..58622df7f3c
--- /dev/null
+++ b/content/publication/ames-nas-11/index.md
@@ -0,0 +1,14 @@
+---
+title: "QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language"
+date: 2011-07-01
+publishDate: 2020-01-05T13:33:05.982083Z
+authors: ["Sasha Ames", "Maya B. Gokhale", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*NAS 2011*"
+tags: ["papers", "metadata", "graphs", "linking", "filesystems"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-pdsw-10-poster/cite.bib b/content/publication/ames-pdsw-10-poster/cite.bib
new file mode 100644
index 00000000000..52a8304ef0d
--- /dev/null
+++ b/content/publication/ames-pdsw-10-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{ames:pdsw10poster,
+ address = {New Orleans, LA},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXQS9hbWVzLXBkc3cxMHBvc3Rlci5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VYW1lcy1wZHN3MTBwb3N0ZXIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUEAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLXBkc3cxMHBvc3Rlci5wZGYAAA4ALAAVAGEAbQBlAHMALQBwAGQAcwB3ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvQS9hbWVzLXBkc3cxMHBvc3Rlci5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010},
+ date-added = {2019-12-26 20:05:01 -0800},
+ date-modified = {2019-12-29 16:32:49 -0800},
+ keywords = {shortpapers, linking, filesystems, metadata},
+ month = {November 15},
+ title = {QMDS: A File System Metadata Service Supporting a Graph Data Model-Based Query Language},
+ year = {2010}
+}
+
diff --git a/content/publication/ames-pdsw-10-poster/index.md b/content/publication/ames-pdsw-10-poster/index.md
new file mode 100644
index 00000000000..3ad5a4be1ce
--- /dev/null
+++ b/content/publication/ames-pdsw-10-poster/index.md
@@ -0,0 +1,14 @@
+---
+title: "QMDS: A File System Metadata Service Supporting a Graph Data Model-Based Query Language"
+date: 2010-11-01
+publishDate: 2020-01-05T06:43:50.382326Z
+authors: ["Sasha Ames", "Maya B. Gokhale", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010*"
+tags: ["shortpapers", "linking", "filesystems", "metadata"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-peds-12/cite.bib b/content/publication/ames-peds-12/cite.bib
new file mode 100644
index 00000000000..f00a377ba89
--- /dev/null
+++ b/content/publication/ames-peds-12/cite.bib
@@ -0,0 +1,14 @@
+@article{ames:peds12,
+ abstract = {File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7--9 orders of magnitude more data, resulting in a number of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data. Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems. To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads.},
+ author = {Sasha Ames and Maya Gokhale and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQS9hbWVzLXBlZHMxMi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYW1lcy1wZWRzMTIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUEAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLXBlZHMxMi5wZGYAAA4AIAAPAGEAbQBlAHMALQBwAGUAZABzADEAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQS9hbWVzLXBlZHMxMi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ date-added = {2012-02-27 18:02:43 +0000},
+ date-modified = {2020-01-05 05:32:03 -0700},
+ journal = {International Journal of Parallel, Emergent and Distributed Systems},
+ keywords = {papers, metadata, management, graphs, filesystems, datamanagement},
+ number = {2},
+ title = {QMDS: a file system metadata management service supporting a graph data model-based query language},
+ volume = {27},
+ year = {2012}
+}
+
diff --git a/content/publication/ames-peds-12/index.md b/content/publication/ames-peds-12/index.md
new file mode 100644
index 00000000000..29c6c38172b
--- /dev/null
+++ b/content/publication/ames-peds-12/index.md
@@ -0,0 +1,14 @@
+---
+title: "QMDS: a file system metadata management service supporting a graph data model-based query language"
+date: 2012-01-01
+publishDate: 2020-01-05T13:33:05.977930Z
+authors: ["Sasha Ames", "Maya Gokhale", "Carlos Maltzahn"]
+publication_types: ["2"]
+abstract: "File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7--9 orders of magnitude more data, resulting in a number of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data. Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems. To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads."
+featured: false
+publication: "*International Journal of Parallel, Emergent and Distributed Systems*"
+tags: ["papers", "metadata", "management", "graphs", "filesystems", "datamanagement"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-sosp-07/cite.bib b/content/publication/ames-sosp-07/cite.bib
new file mode 100644
index 00000000000..de59da2cfea
--- /dev/null
+++ b/content/publication/ames-sosp-07/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{ames:sosp07,
+ address = {Stevenson, WA},
+ author = {Sasha Ames and Carlos Maltzahn and Ethan L. Miller},
+ booktitle = {Poster Session at the 21st Symposium on Operating Systems Principles (SOSP 2007)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2019-12-29 16:44:11 -0800},
+ keywords = {shortpapers, metadata, filesystems, querying},
+ month = {October},
+ title = {A File System Query Language},
+ year = {2007}
+}
+
diff --git a/content/publication/ames-sosp-07/index.md b/content/publication/ames-sosp-07/index.md
new file mode 100644
index 00000000000..0a8463915c5
--- /dev/null
+++ b/content/publication/ames-sosp-07/index.md
@@ -0,0 +1,14 @@
+---
+title: "A File System Query Language"
+date: 2007-10-01
+publishDate: 2020-01-05T06:43:50.640287Z
+authors: ["Sasha Ames", "Carlos Maltzahn", "Ethan L. Miller"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at the 21st Symposium on Operating Systems Principles (SOSP 2007)*"
+tags: ["shortpapers", "metadata", "filesystems", "querying"]
+projects:
+- metadata-rich
+---
+
diff --git a/content/publication/ames-tr-0710/cite.bib b/content/publication/ames-tr-0710/cite.bib
new file mode 100644
index 00000000000..1ee0f3e5852
--- /dev/null
+++ b/content/publication/ames-tr-0710/cite.bib
@@ -0,0 +1,12 @@
+@techreport{ames:tr0710,
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBCLi4vLi4vLi4vLi4vVXNlcnMvY2FybG9zbWFsdC9zdm4vcWZzL21zc3QxMC1xZnMvVUNTQy1TT0UtMTAtMDcucGRmTxEBcgAAAAABcgACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////ElVDU0MtU09FLTEwLTA3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAD/////AAAKIGN1AAAAAAAAAAAAAAAAAAptc3N0MTAtcWZzAAIAOC86VXNlcnM6Y2FybG9zbWFsdDpzdm46cWZzOm1zc3QxMC1xZnM6VUNTQy1TT0UtMTAtMDcucGRmAA4AJgASAFUAQwBTAEMALQBTAE8ARQAtADEAMAAtADAANwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIANlVzZXJzL2Nhcmxvc21hbHQvc3ZuL3Fmcy9tc3N0MTAtcWZzL1VDU0MtU09FLTEwLTA3LnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABpAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAd8=},
+ date-added = {2010-02-04 09:10:55 -0800},
+ date-modified = {2010-02-04 09:13:17 -0800},
+ institution = {UCSC},
+ month = {February},
+ number = {UCSC-SOE-10-07},
+ title = {Design and Implementation of a Metadata-Rich File System},
+ year = {2010}
+}
+
diff --git a/content/publication/ames-tr-0710/index.md b/content/publication/ames-tr-0710/index.md
new file mode 100644
index 00000000000..35875ac0fc2
--- /dev/null
+++ b/content/publication/ames-tr-0710/index.md
@@ -0,0 +1,11 @@
+---
+title: "Design and Implementation of a Metadata-Rich File System"
+date: 2010-02-01
+publishDate: 2020-01-05T06:43:50.602702Z
+authors: ["Sasha Ames", "Maya B. Gokhale", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/amvrosiadis-nsfvision-18/cite.bib b/content/publication/amvrosiadis-nsfvision-18/cite.bib
new file mode 100644
index 00000000000..a5498d00cfa
--- /dev/null
+++ b/content/publication/amvrosiadis-nsfvision-18/cite.bib
@@ -0,0 +1,13 @@
+@unpublished{amvrosiadis:nsfvision18,
+ author = {George Amvrosiadis and Ali R. Butt and Vasily Tarasov and Ming Zhao and others},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA2Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmTxEBjAAAAAABjAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA368jeEJEAAH/////G2FtdnJvc2lhZGlzLW5zZnZpc2lvbjE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////X+gitAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFBAAACAEAvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkE6YW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmAA4AOAAbAGEAbQB2AHIAbwBzAGkAYQBkAGkAcwAtAG4AcwBmAHYAaQBzAGkAbwBuADEAOAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAPlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAF0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB7Q==},
+ bdsk-url-1 = {https://www.overleaf.com/7988123186fbmpsqghjkgr},
+ date-added = {2023-01-13 13:20:46 -0800},
+ date-modified = {2023-01-13 13:20:46 -0800},
+ keywords = {papers, vision, storage, systems, research},
+ month = {May 30 - June 1},
+ note = {Report on NSF Visioning Workshop},
+ title = {Data Storage Research Vision 2025},
+ year = {2018}
+}
+
diff --git a/content/publication/amvrosiadis-nsfvision-18/index.md b/content/publication/amvrosiadis-nsfvision-18/index.md
new file mode 100644
index 00000000000..557871be85d
--- /dev/null
+++ b/content/publication/amvrosiadis-nsfvision-18/index.md
@@ -0,0 +1,12 @@
+---
+title: "Data Storage Research Vision 2025"
+date: 2018-05-01
+publishDate: 2023-01-26T14:23:16.860682Z
+authors: ["George Amvrosiadis", "Ali R. Butt", "Vasily Tarasov", "Ming Zhao", " others"]
+publication_types: ["3"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "vision", "storage", "systems", "research"]
+---
+
diff --git a/content/publication/bhagwan-scc-09/cite.bib b/content/publication/bhagwan-scc-09/cite.bib
new file mode 100644
index 00000000000..034bdf6bdd0
--- /dev/null
+++ b/content/publication/bhagwan-scc-09/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{bhagwan:scc09,
+ address = {Bangalore, India},
+ author = {Varun Bhagwan and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUQi92YmhhZ3dhbi1zY2MwOS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SdmJoYWd3YW4tc2NjMDkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Qjp2YmhhZ3dhbi1zY2MwOS5wZGYADgAmABIAdgBiAGgAYQBnAHcAYQBuAC0AcwBjAGMAMAA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9CL3ZiaGFnd2FuLXNjYzA5LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {Work-In-Progress Session at 2009 IEEE International Conference on Services Computing (SCC 2009)},
+ date-added = {2019-12-29 16:11:09 -0800},
+ date-modified = {2019-12-29 16:11:52 -0800},
+ keywords = {shortpapers, crowdsourcing, metadata, filesystems},
+ month = {September 21--25},
+ title = {JabberWocky: Crowd-Sourcing Metadata for Files},
+ year = {2009}
+}
+
diff --git a/content/publication/bhagwan-scc-09/index.md b/content/publication/bhagwan-scc-09/index.md
new file mode 100644
index 00000000000..a753f26ddad
--- /dev/null
+++ b/content/publication/bhagwan-scc-09/index.md
@@ -0,0 +1,12 @@
+---
+title: "JabberWocky: Crowd-Sourcing Metadata for Files"
+date: 2009-09-01
+publishDate: 2020-01-05T06:43:50.378229Z
+authors: ["Varun Bhagwan", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-In-Progress Session at 2009 IEEE International Conference on Services Computing (SCC 2009)*"
+tags: ["shortpapers", "crowdsourcing", "metadata", "filesystems"]
+---
+
diff --git a/content/publication/bhagwan-spe-12/cite.bib b/content/publication/bhagwan-spe-12/cite.bib
new file mode 100644
index 00000000000..2b2efdb1eed
--- /dev/null
+++ b/content/publication/bhagwan-spe-12/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{bhagwan:spe12,
+ abstract = {In healthcare, de-identification is fast becoming a service that is indispensable when medical data needs to be used for research and secondary use purposes. Currently, this process is done either manually, by human agent, or by an automated software algorithm. Both approaches have shortcomings. Here, we introduce a framework for enhancing the outcome of the current modes of executing a de-identification service. This paper presents the steps taken in conceiving and building a privacy framework and tool that improves the service of de-identification. Further, we test the usefulness and applicability of this system through a study with HIPAA-trained experts.},
+ address = {Honolulu, HI},
+ author = {Varun Bhagwan and Tyrone Grandison and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATQi9iaGFnd2FuLXNwZTEyLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFiaGFnd2FuLXNwZTEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQgAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpCOmJoYWd3YW4tc3BlMTIucGRmAAAOACQAEQBiAGgAYQBnAHcAYQBuAC0AcwBwAGUAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9CL2JoYWd3YW4tc3BlMTIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {IEEE 2012 Services Workshop on Security and Privacy Engineering (SPE2012)},
+ date-added = {2012-05-22 03:42:44 +0000},
+ date-modified = {2020-01-05 05:29:59 -0700},
+ keywords = {papers, privacy, humancomputation, healthcare},
+ month = {June},
+ title = {Recommendation-based De-Identification | A Practical Systems Approach towards De-identification of Unstructured Text in Healthcare},
+ year = {2012}
+}
+
diff --git a/content/publication/bhagwan-spe-12/index.md b/content/publication/bhagwan-spe-12/index.md
new file mode 100644
index 00000000000..803a72aaaba
--- /dev/null
+++ b/content/publication/bhagwan-spe-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "Recommendation-based De-Identification | A Practical Systems Approach towards De-identification of Unstructured Text in Healthcare"
+date: 2012-06-01
+publishDate: 2020-01-05T13:33:05.971980Z
+authors: ["Varun Bhagwan", "Tyrone Grandison", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "In healthcare, de-identification is fast becoming a service that is indispensable when medical data needs to be used for research and secondary use purposes. Currently, this process is done either manually, by human agent, or by an automated software algorithm. Both approaches have shortcomings. Here, we introduce a framework for enhancing the outcome of the current modes of executing a de-identification service. This paper presents the steps taken in conceiving and building a privacy framework and tool that improves the service of de-identification. Further, we test the usefulness and applicability of this system through a study with HIPAA-trained experts."
+featured: false
+publication: "*IEEE 2012 Services Workshop on Security and Privacy Engineering (SPE2012)*"
+tags: ["papers", "privacy", "humancomputation", "healthcare"]
+---
+
diff --git a/content/publication/bigelow-fast-08-wip/cite.bib b/content/publication/bigelow-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..4d4815f80f6
--- /dev/null
+++ b/content/publication/bigelow-fast-08-wip/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{bigelow:fast08wip,
+ address = {San Jose, CA},
+ author = {David Bigelow and Scott A. Brandt and Carlos Maltzahn and Sage Weil},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXQi9iaWdlbG93LWZhc3QwOHdpcC5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VYmlnZWxvdy1mYXN0MDh3aXAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpiaWdlbG93LWZhc3QwOHdpcC5wZGYAAA4ALAAVAGIAaQBnAGUAbABvAHcALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvQi9iaWdlbG93LWZhc3QwOHdpcC5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:25:47 -0800},
+ date-modified = {2019-12-29 16:31:55 -0800},
+ keywords = {shortpapers, raid, objectstorage},
+ month = {February 26-29},
+ title = {Adapting RAID Methods for Use in Object Storage Systems},
+ year = {2008}
+}
+
diff --git a/content/publication/bigelow-fast-08-wip/index.md b/content/publication/bigelow-fast-08-wip/index.md
new file mode 100644
index 00000000000..a6e5499d620
--- /dev/null
+++ b/content/publication/bigelow-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "Adapting RAID Methods for Use in Object Storage Systems"
+date: 2008-02-01
+publishDate: 2020-01-05T06:43:50.375992Z
+authors: ["David Bigelow", "Scott A. Brandt", "Carlos Maltzahn", "Sage Weil"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)*"
+tags: ["shortpapers", "raid", "objectstorage"]
+---
+
diff --git a/content/publication/bigelow-pdsw-07/cite.bib b/content/publication/bigelow-pdsw-07/cite.bib
new file mode 100644
index 00000000000..a23984483a7
--- /dev/null
+++ b/content/publication/bigelow-pdsw-07/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{bigelow:pdsw07,
+ abstract = {Many applications---for example, scientific simulation, real-time data acquisition, and distributed reservation systems---have I/O performance requirements, yet most large, distributed storage systems lack the ability to guarantee I/O performance. We are working on end-to-end performance management in scalable, distributed storage systems. The kinds of storage systems we are targeting include large high-performance computing (HPC) clusters, which require both large data volumes and high I/O rates, as well as large-scale general-purpose storage systems.},
+ address = {Reno, NV},
+ author = {David Bigelow and Suresh Iyer and Tim Kaldewey and Roberto Pineiro and Anna Povzner and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUQi9iaWdlbG93LXBkc3cwNy5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SYmlnZWxvdy1wZHN3MDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpiaWdlbG93LXBkc3cwNy5wZGYADgAmABIAYgBpAGcAZQBsAG8AdwAtAHAAZABzAHcAMAA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9CL2JpZ2Vsb3ctcGRzdzA3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2020-01-05 05:56:32 -0700},
+ keywords = {papers, performance, management, distributed, storage, scalable},
+ title = {End-to-end Performance Management for Scalable Distributed Storage},
+ year = {2007}
+}
+
diff --git a/content/publication/bigelow-pdsw-07/index.md b/content/publication/bigelow-pdsw-07/index.md
new file mode 100644
index 00000000000..ce5f9bf4e4a
--- /dev/null
+++ b/content/publication/bigelow-pdsw-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "End-to-end Performance Management for Scalable Distributed Storage"
+date: 2007-01-01
+publishDate: 2020-01-05T13:33:05.992567Z
+authors: ["David Bigelow", "Suresh Iyer", "Tim Kaldewey", "Roberto Pineiro", "Anna Povzner", "Scott A. Brandt", "Richard Golding", "Theodore Wong", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Many applications---for example, scientific simulation, real-time data acquisition, and distributed reservation systems---have I/O performance requirements, yet most large, distributed storage systems lack the ability to guarantee I/O performance. We are working on end-to-end performance management in scalable, distributed storage systems. The kinds of storage systems we are targeting include large high-performance computing (HPC) clusters, which require both large data volumes and high I/O rates, as well as large-scale general-purpose storage systems."
+featured: false
+publication: "*Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)*"
+tags: ["papers", "performance", "management", "distributed", "storage", "scalable"]
+---
+
diff --git a/content/publication/bobb-wdas-06/cite.bib b/content/publication/bobb-wdas-06/cite.bib
new file mode 100644
index 00000000000..bb81aa10d8b
--- /dev/null
+++ b/content/publication/bobb-wdas-06/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{bobb:wdas06,
+ address = {San Jose, CA},
+ author = {Nikhil Bobb and Damian Eads and Mark W. Storer and Scott A. Brandt and Carlos Maltzahn and Ethan L. Miller},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQi9ib2JiLXdkYXMwNi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYm9iYi13ZGFzMDYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Qjpib2JiLXdkYXMwNi5wZGYAAA4AIAAPAGIAbwBiAGIALQB3AGQAYQBzADAANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQi9ib2JiLXdkYXMwNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ booktitle = {Proceedings of the 7th International Workshop on Distributed Data and Structures (WDAS 2006)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:53:59 -0800},
+ local-url = {/Users/carlosmalt/Documents/Papers/bobb-wdas06.pdf},
+ month = {January},
+ title = {Graffiti: A Framework for Testing Collaborative Distributed Metadata},
+ year = {2006}
+}
+
diff --git a/content/publication/bobb-wdas-06/index.md b/content/publication/bobb-wdas-06/index.md
new file mode 100644
index 00000000000..75da40cf281
--- /dev/null
+++ b/content/publication/bobb-wdas-06/index.md
@@ -0,0 +1,11 @@
+---
+title: "Graffiti: A Framework for Testing Collaborative Distributed Metadata"
+date: 2006-01-01
+publishDate: 2020-01-05T06:43:50.651198Z
+authors: ["Nikhil Bobb", "Damian Eads", "Mark W. Storer", "Scott A. Brandt", "Carlos Maltzahn", "Ethan L. Miller"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Proceedings of the 7th International Workshop on Distributed Data and Structures (WDAS 2006)*"
+---
+
diff --git a/content/publication/brandt-ospert-08/cite.bib b/content/publication/brandt-ospert-08/cite.bib
new file mode 100644
index 00000000000..639a0c62c48
--- /dev/null
+++ b/content/publication/brandt-ospert-08/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{brandt:ospert08,
+ abstract = {Real-time systems are growing in size and complexity and must often manage multiple competing tasks in environments where CPU is not the only limited shared resource. Memory, network, and other devices may also be shared and system-wide performance guarantees may require the allocation and scheduling of many diverse resources. We present our on-going work on performance management in a representative distributed real-time system---a distributed storage system with performance requirements---and discuss our integrated model for managing diverse resources to provide end-to-end performance guarantees.},
+ address = {Prague, Czech Republic},
+ author = {Scott A. Brandt and Carlos Maltzahn and Anna Povzner and Roberto Pineiro and Andrew Shewmaker and Tim Kaldewey},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVQi9icmFuZHQtb3NwZXJ0MDgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2JyYW5kdC1vc3BlcnQwOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnJhbmR0LW9zcGVydDA4LnBkZgAADgAoABMAYgByAGEAbgBkAHQALQBvAHMAcABlAHIAdAAwADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0IvYnJhbmR0LW9zcGVydDA4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {OSPERT 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:01:44 -0700},
+ keywords = {papers, storage, systems, distributed, performance, management, qos, realtime},
+ month = {July},
+ title = {An Integrated Model for Performance Management in a Distributed System},
+ year = {2008}
+}
+
diff --git a/content/publication/brandt-ospert-08/index.md b/content/publication/brandt-ospert-08/index.md
new file mode 100644
index 00000000000..4eae8f8270d
--- /dev/null
+++ b/content/publication/brandt-ospert-08/index.md
@@ -0,0 +1,12 @@
+---
+title: "An Integrated Model for Performance Management in a Distributed System"
+date: 2008-07-01
+publishDate: 2020-01-05T13:33:05.996623Z
+authors: ["Scott A. Brandt", "Carlos Maltzahn", "Anna Povzner", "Roberto Pineiro", "Andrew Shewmaker", "Tim Kaldewey"]
+publication_types: ["1"]
+abstract: "Real-time systems are growing in size and complexity and must often manage multiple competing tasks in environments where CPU is not the only limited shared resource. Memory, network, and other devices may also be shared and system-wide performance guarantees may require the allocation and scheduling of many diverse resources. We present our on-going work on performance management in a representative distributed real-time system---a distributed storage system with performance requirements---and discuss our integrated model for managing diverse resources to provide end-to-end performance guarantees."
+featured: false
+publication: "*OSPERT 2008*"
+tags: ["papers", "storage", "systems", "distributed", "performance", "management", "qos", "realtime"]
+---
+
diff --git a/content/publication/brandt-pdsw-09/cite.bib b/content/publication/brandt-pdsw-09/cite.bib
new file mode 100644
index 00000000000..86f00690edf
--- /dev/null
+++ b/content/publication/brandt-pdsw-09/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{brandt:pdsw09,
+ abstract = {File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze large-scale data, we are developing Damasc, an enhanced file system where rich data management services for scientific computing are provided as a native part of the file system.
+This paper presents our vision for Damasc, a performant file system that would allow scientists or even casual users to pose declarative queries and updates over views of underlying files that are stored in their native bytestream format. In Damasc, a configurable layer is added on top of the file system to expose the contents of files in a logical data model through which views can be defined and used for queries and updates. The logical data model and views are leveraged to optimize access to files through caching and self-organizing indexing. In addition, provenance capture and analysis to file access is also built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code.},
+ address = {Portland, OR},
+ author = {Scott A. Brandt and Carlos Maltzahn and Neoklis Polyzotis and Wang-Chiew Tan},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATQi9icmFuZHQtcGRzdzA5LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFicmFuZHQtcGRzdzA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQgAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpCOmJyYW5kdC1wZHN3MDkucGRmAAAOACQAEQBiAHIAYQBuAGQAdAAtAHAAZABzAHcAMAA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9CL2JyYW5kdC1wZHN3MDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ date-added = {2010-01-26 23:50:43 -0800},
+ date-modified = {2020-01-05 05:49:01 -0700},
+ keywords = {papers, datamanagement, filesystems},
+ month = {November 15},
+ title = {Fusing Data Management Services with File Systems},
+ year = {2009}
+}
+
diff --git a/content/publication/brandt-pdsw-09/index.md b/content/publication/brandt-pdsw-09/index.md
new file mode 100644
index 00000000000..2761e2f2b60
--- /dev/null
+++ b/content/publication/brandt-pdsw-09/index.md
@@ -0,0 +1,14 @@
+---
+title: "Fusing Data Management Services with File Systems"
+date: 2009-11-01
+publishDate: 2020-01-05T13:33:05.989862Z
+authors: ["Scott A. Brandt", "Carlos Maltzahn", "Neoklis Polyzotis", "Wang-Chiew Tan"]
+publication_types: ["1"]
+abstract: "File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze large-scale data, we are developing Damasc, an enhanced file system where rich data management services for scientific computing are provided as a native part of the file system. This paper presents our vision for Damasc, a performant file system that would allow scientists or even casual users to pose declarative queries and updates over views of underlying files that are stored in their native bytestream format. In Damasc, a configurable layer is added on top of the file system to expose the contents of files in a logical data model through which views can be defined and used for queries and updates. The logical data model and views are leveraged to optimize access to files through caching and self-organizing indexing. In addition, provenance capture and analysis to file access is also built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code."
+featured: false
+publication: "*Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)*"
+tags: ["papers", "datamanagement", "filesystems"]
+projects:
+- programmable-storage
+---
+
diff --git a/content/publication/brummell-pmes-16/cite.bib b/content/publication/brummell-pmes-16/cite.bib
new file mode 100644
index 00000000000..5e055aec5fa
--- /dev/null
+++ b/content/publication/brummell-pmes-16/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{brummell:pmes16,
+ abstract = {To raise performance beyond Moore's law scaling, Approximate Computing reduces arithmetic quality to increase operations per second or per joule. It works on only a few applications. The quality-speed tradeoff seems inescapable; however, Unum Arithmetic simultaneously raises arithmetic quality yet reduces the number of bits required. Unums extend IEEE floats (type 1) or provide custom number systems to maximize information per bit (type 2). Unums achieve Approximate Computing cost savings without sacrificing answer quality.},
+ author = {Nic Brummell and John L. Gustafson and Andrew Klofas and Carlos Maltzahn and Andrew Shewmaker},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVQi9icnVtbWVsbC1wbWVzMTYucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2JydW1tZWxsLXBtZXMxNi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnJ1bW1lbGwtcG1lczE2LnBkZgAADgAoABMAYgByAHUAbQBtAGUAbABsAC0AcABtAGUAcwAxADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0IvYnJ1bW1lbGwtcG1lczE2LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {PMES 2016},
+ date-added = {2016-10-21 17:31:51 +0000},
+ date-modified = {2020-01-04 21:47:19 -0700},
+ keywords = {papers, math, computation},
+ month = {November 14},
+ title = {Unum Arithmetic: Better Math with Clearer Tradeoffs},
+ year = {2016}
+}
+
diff --git a/content/publication/brummell-pmes-16/index.md b/content/publication/brummell-pmes-16/index.md
new file mode 100644
index 00000000000..4f42f3edcd1
--- /dev/null
+++ b/content/publication/brummell-pmes-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "Unum Arithmetic: Better Math with Clearer Tradeoffs"
+date: 2016-11-01
+publishDate: 2020-01-05T06:43:50.454363Z
+authors: ["Nic Brummell", "John L. Gustafson", "Andrew Klofas", "Carlos Maltzahn", "Andrew Shewmaker"]
+publication_types: ["1"]
+abstract: "To raise performance beyond Moore's law scaling, Approximate Computing reduces arithmetic quality to increase operations per second or per joule. It works on only a few applications. The quality-speed tradeoff seems inescapable; however, Unum Arithmetic simultaneously raises arithmetic quality yet reduces the number of bits required. Unums extend IEEE floats (type 1) or provide custom number systems to maximize information per bit (type 2). Unums achieve Approximate Computing cost savings without sacrificing answer quality."
+featured: false
+publication: "*PMES 2016*"
+tags: ["papers", "math", "computation"]
+---
+
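+The ubit is the most self-contained piece of the unum design to illustrate. The sketch below is illustrative only (it keeps ordinary Python floats rather than unums' variable-width fields) and shows how the exact/inexact bit turns a rounded value into honest interval bounds.
+
+```python
+# Illustrative sketch only: the "ubit" behind type 1 unums, on top of
+# ordinary Python floats. Real unums also vary exponent and fraction
+# widths; this shows just the exact/inexact bit.
+import math
+
+def as_interval(value: float, ubit: int) -> tuple[float, float]:
+    """ubit=0: value is exact. ubit=1: the true value lies in the open
+    interval between value and the next representable float."""
+    if ubit == 0:
+        return (value, value)
+    return (value, math.nextafter(value, math.inf))
+
+# 0.1 has no exact binary representation, so a unum would carry ubit=1:
+lo, hi = as_interval(0.1, ubit=1)
+print(f"0.1 lies in ({lo!r}, {hi!r})")  # bounds instead of silent rounding
+```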
diff --git a/content/publication/buck-dadc-09/cite.bib b/content/publication/buck-dadc-09/cite.bib
new file mode 100644
index 00000000000..b5e08f6dadd
--- /dev/null
+++ b/content/publication/buck-dadc-09/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{buck:dadc09,
+ abstract = {High-end computing is increasingly I/O bound as computations become more data-intensive, and data transport technologies struggle to keep pace with the demands of large-scale, distributed computations. One approach to avoiding unnecessary I/O is to move the processing to the data, as seen in Google's successful, but relatively specialized, MapReduce system. This paper discusses our investigation towards a general solution for enabling in-situ computation in a peta-scale storage system. We believe our work with flexible, application-specific structured storage is the key to addressing the I/O overhead caused by data partitioning across storage nodes. In order to manage competing workloads on storage nodes, our research in system performance management is leveraged. Our ultimate goal is a general framework for in-situ data-intensive processing, indexing, and searching, which we expect to provide orders of magnitude performance increases for data-intensive workloads.},
+ address = {Munich, Germany},
+ author = {Joe Buck and Noah Watkins and Carlos Maltzahn and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQi9idWNrLWRhZGMwOS5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYnVjay1kYWRjMDkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpidWNrLWRhZGMwOS5wZGYAAA4AIAAPAGIAdQBjAGsALQBkAGEAZABjADAAOQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQi9idWNrLWRhZGMwOS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ booktitle = {2nd International Workshop on Data-Aware Distributed Computing (in conjunction with HPDC-18)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:01:11 -0700},
+ keywords = {papers, filesystems, programmable},
+ month = {June 9},
+ title = {Abstract Storage: Moving file format-specific abstractions into petabyte-scale storage systems},
+ year = {2009}
+}
+
diff --git a/content/publication/buck-dadc-09/index.md b/content/publication/buck-dadc-09/index.md
new file mode 100644
index 00000000000..90044710833
--- /dev/null
+++ b/content/publication/buck-dadc-09/index.md
@@ -0,0 +1,14 @@
+---
+title: "Abstract Storage: Moving file format-specific abstractions into petabyte-scale storage systems"
+date: 2009-06-01
+publishDate: 2020-01-05T13:33:05.993353Z
+authors: ["Joe Buck", "Noah Watkins", "Carlos Maltzahn", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: "High-end computing is increasingly I/O bound as computations become more data-intensive, and data transport technologies struggle to keep pace with the demands of large-scale, distributed computations. One approach to avoiding unnecessary I/O is to move the processing to the data, as seen in Google's successful, but relatively specialized, MapReduce system. This paper discusses our investigation towards a general solution for enabling in-situ computation in a peta-scale storage system. We believe our work with flexible, application-specific structured storage is the key to addressing the I/O overhead caused by data partitioning across storage nodes. In order to manage competing workloads on storage nodes, our research in system performance management is leveraged. Our ultimate goal is a general framework for in-situ data-intensive processing, indexing, and searching, which we expect to provide orders of magnitude performance increases for data-intensive workloads."
+featured: false
+publication: "*2nd International Workshop on Data-Aware Distributed Computing (in conjunction with HPDC-18)*"
+tags: ["papers", "filesystems", "programmable"]
+projects:
+- programmable-storage
+---
+
diff --git a/content/publication/buck-sc-11/cite.bib b/content/publication/buck-sc-11/cite.bib
new file mode 100644
index 00000000000..0450a0aca1d
--- /dev/null
+++ b/content/publication/buck-sc-11/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{buck:sc11,
+ abstract = {Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.},
+ address = {Seattle, WA},
+ author = {Joe Buck and Noah Watkins and Jeff LeFevre and Kleoni Ioannidou and Carlos Maltzahn and Neoklis Polyzotis and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPQi9idWNrLXNjMTEucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWJ1Y2stc2MxMS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnVjay1zYzExLnBkZgAADgAcAA0AYgB1AGMAawAtAHMAYwAxADEALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0IvYnVjay1zYzExLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ booktitle = {SC '11},
+ date-added = {2011-08-02 22:58:10 +0000},
+ date-modified = {2020-01-05 05:34:48 -0700},
+ keywords = {papers, mapreduce, datamanagement, hpc, structured, netcdf},
+ month = {November},
+ read = {1},
+ title = {SciHadoop: Array-based Query Processing in Hadoop},
+ year = {2011}
+}
+
diff --git a/content/publication/buck-sc-11/index.md b/content/publication/buck-sc-11/index.md
new file mode 100644
index 00000000000..b9f963caa3f
--- /dev/null
+++ b/content/publication/buck-sc-11/index.md
@@ -0,0 +1,14 @@
+---
+title: "SciHadoop: Array-based Query Processing in Hadoop"
+date: 2011-11-01
+publishDate: 2020-01-05T13:33:05.981084Z
+authors: ["Joe Buck", "Noah Watkins", "Jeff LeFevre", "Kleoni Ioannidou", "Carlos Maltzahn", "Neoklis Polyzotis", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: "Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network."
+featured: false
+publication: "*SC '11*"
+tags: ["papers", "mapreduce", "datamanagement", "hpc", "structured", "netcdf"]
+projects:
+- programmable-storage
+---
+
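+One of the paper's partitioning optimizations is easy to sketch. Assuming, purely for illustration, a 1-D array stored in fixed-size blocks, aligning logical sub-queries to block boundaries gives each map task a single-block, local read.
+
+```python
+# Toy sketch of a SciHadoop-style optimization: partition a logical array
+# query along storage block boundaries to increase read locality.
+# Names and numbers are illustrative only.
+def partition_query(start: int, stop: int, block_len: int) -> list[tuple[int, int]]:
+    """Split the logical extent [start, stop) into block-aligned sub-extents."""
+    parts = []
+    while start < stop:
+        block_end = (start // block_len + 1) * block_len
+        parts.append((start, min(block_end, stop)))
+        start = min(block_end, stop)
+    return parts
+
+# A query over cells [100, 1000) with 256-cell blocks becomes four map
+# tasks, each reading from exactly one block:
+print(partition_query(100, 1000, 256))
+# [(100, 256), (256, 512), (512, 768), (768, 1000)]
+```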
diff --git a/content/publication/buck-sc-13/cite.bib b/content/publication/buck-sc-13/cite.bib
new file mode 100644
index 00000000000..4ebf5d832c0
--- /dev/null
+++ b/content/publication/buck-sc-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{buck:sc13,
+ abstract = {The MapReduce framework is being extended for domains quite different from the web applications for which it was designed, including the processing of big structured data, e.g., scientific and financial data. Previous work using MapReduce to process scientific data ignores existing structure when assigning intermediate data and scheduling tasks. In this paper, we present a method for incorporating knowledge of the structure of scientific data and of the executing query into the MapReduce communication model. Built in SciHadoop, a version of the Hadoop MapReduce framework for scientific data, SIDR intelligently partitions and routes intermediate data, allowing it to: remove Hadoop's global barrier and execute Reduce tasks prior to all Map tasks completing; minimize intermediate key skew; and produce early, correct results. SIDR executes queries up to 2.5 times faster than Hadoop and 37% faster than SciHadoop; produces initial results with only 6% of the query completed; and produces dense, contiguous output.},
+ address = {Denver, CO},
+ author = {Joe Buck and Noah Watkins and Greg Levin and Adam Crume and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn and Neoklis Polyzotis and Aaron Torres},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPQi9idWNrLXNjMTMucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWJ1Y2stc2MxMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnVjay1zYzEzLnBkZgAADgAcAA0AYgB1AGMAawAtAHMAYwAxADMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0IvYnVjay1zYzEzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ booktitle = {SC '13},
+ date-added = {2013-07-21 00:28:59 +0000},
+ date-modified = {2020-01-04 23:20:15 -0700},
+ keywords = {papers, mapreduce, structured, datamanagement, routing, hpc},
+ month = {November},
+ title = {SIDR: Structure-Aware Intelligent Data Routing in Hadoop},
+ year = {2013}
+}
+
diff --git a/content/publication/buck-sc-13/index.md b/content/publication/buck-sc-13/index.md
new file mode 100644
index 00000000000..b4152d5537e
--- /dev/null
+++ b/content/publication/buck-sc-13/index.md
@@ -0,0 +1,14 @@
+---
+title: "SIDR: Structure-Aware Intelligent Data Routing in Hadoop"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.532572Z
+authors: ["Joe Buck", "Noah Watkins", "Greg Levin", "Adam Crume", "Kleoni Ioannidou", "Scott Brandt", "Carlos Maltzahn", "Neoklis Polyzotis", "Aaron Torres"]
+publication_types: ["1"]
+abstract: "The MapReduce framework is being extended for domains quite different from the web applications for which it was designed, including the processing of big structured data, e.g., scientific and financial data. Previous work using MapReduce to process scientific data ignores existing structure when assigning intermediate data and scheduling tasks. In this paper, we present a method for incorporating knowledge of the structure of scientific data and executing query into the MapReduce communication model. Built in SciHadoop, a version of the Hadoop MapReduce framework for scientific data, SIDR intelligently partitions and routes intermediate data, allowing it to: remove Hadoop's global barrier and execute Reduce tasks prior to all Map tasks completing; minimize intermediate key skew; and produce early, correct results. SIDR executes queries up to 2.5 times faster than Hadoop and 37% faster than SciHadoop; produces initial results with only 6% of the query completed; and produces dense, contiguous output."
+featured: false
+publication: "*SC '13*"
+tags: ["papers", "mapreduce", "structured", "datamanagement", "routing", "hpc"]
+projects:
+- programmable-storage
+---
+
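+The routing idea admits a minimal sketch (names invented, not SIDR's code): because reducers own fixed output extents, the map tasks each reducer depends on are known before execution, so a reducer can start, and emit correct early results, as soon as its own inputs finish rather than after a global barrier.
+
+```python
+# Toy illustration of structure-aware routing: each reducer owns a
+# contiguous key extent, so its dependencies are computable up front.
+def reducer_for(key: int, extent_len: int) -> int:
+    """Contiguous key extents map deterministically to reducers."""
+    return key // extent_len
+
+def dependencies(map_extents, extent_len):
+    """Which map tasks feed each reducer, known before any map runs."""
+    deps = {}
+    for task_id, (lo, hi) in enumerate(map_extents):
+        for r in range(reducer_for(lo, extent_len), reducer_for(hi - 1, extent_len) + 1):
+            deps.setdefault(r, set()).add(task_id)
+    return deps
+
+# Reducer 0 needs only map task 0, so it can run before tasks 1 and 2 finish:
+print(dependencies([(0, 100), (100, 200), (200, 300)], extent_len=100))
+# {0: {0}, 1: {1}, 2: {2}}
+```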
diff --git a/content/publication/buck-tr-0411/cite.bib b/content/publication/buck-tr-0411/cite.bib
new file mode 100644
index 00000000000..fd7e90ea18d
--- /dev/null
+++ b/content/publication/buck-tr-0411/cite.bib
@@ -0,0 +1,12 @@
+@techreport{buck:tr0411,
+ author = {Joe Buck and Noah Watkins and Jeff LeFevre and Kleoni Ioannidou and Carlos Maltzahn and Neoklis Polyzotis and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQi9idWNrLXRyMDQxMS5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYnVjay10cjA0MTEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpidWNrLXRyMDQxMS5wZGYAAA4AIAAPAGIAdQBjAGsALQB0AHIAMAA0ADEAMQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQi9idWNrLXRyMDQxMS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ date-added = {2011-05-27 00:06:15 -0700},
+ date-modified = {2011-05-27 00:15:42 -0700},
+ institution = {UCSC},
+ month = {April},
+ number = {UCSC-SOE-11-04},
+ title = {SciHadoop: Array-based Query Processing in Hadoop},
+ year = {2011}
+}
+
diff --git a/content/publication/buck-tr-0411/index.md b/content/publication/buck-tr-0411/index.md
new file mode 100644
index 00000000000..066e21af714
--- /dev/null
+++ b/content/publication/buck-tr-0411/index.md
@@ -0,0 +1,11 @@
+---
+title: "SciHadoop: Array-based Query Processing in Hadoop"
+date: 2011-04-01
+publishDate: 2020-01-05T12:39:43.053721Z
+authors: ["Joe Buck", "Noah Watkins", "Jeff LeFevre", "Kleoni Ioannidou", "Carlos Maltzahn", "Neoklis Polyzotis", "Scott A. Brandt"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/buck-ucsctr-12/cite.bib b/content/publication/buck-ucsctr-12/cite.bib
new file mode 100644
index 00000000000..61461cbe9f1
--- /dev/null
+++ b/content/publication/buck-ucsctr-12/cite.bib
@@ -0,0 +1,15 @@
+@techreport{buck:ucsctr12,
+ address = {Santa Cruz, CA},
+ author = {Joe Buck and Noah Watkins and Greg Levin and Adam Crume and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn and Neoklis Polyzotis},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATQi9idWNrLXVjc2N0cjEyLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFidWNrLXVjc2N0cjEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQgAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpCOmJ1Y2stdWNzY3RyMTIucGRmAAAOACQAEQBiAHUAYwBrAC0AdQBjAHMAYwB0AHIAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9CL2J1Y2stdWNzY3RyMTIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ date-added = {2013-05-30 22:56:59 +0000},
+ date-modified = {2013-05-30 22:59:07 +0000},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, mapreduce, hadoop, hpc, communication, networking, structured, datamanagement},
+ month = {July 26},
+ number = {UCSC-SOE-12-08},
+ title = {Structure-Aware Intelligent Data Routing in SciHadoop},
+ type = {Technical Report},
+ year = {2012}
+}
+
diff --git a/content/publication/buck-ucsctr-12/index.md b/content/publication/buck-ucsctr-12/index.md
new file mode 100644
index 00000000000..218021df866
--- /dev/null
+++ b/content/publication/buck-ucsctr-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "Structure-Aware Intelligent Data Routing in SciHadoop"
+date: 2012-07-01
+publishDate: 2020-01-05T06:43:50.538387Z
+authors: ["Joe Buck", "Noah Watkins", "Greg Levin", "Adam Crume", "Kleoni Ioannidou", "Scott Brandt", "Carlos Maltzahn", "Neoklis Polyzotis"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "mapreduce", "hadoop", "hpc", "communication", "networking", "structured", "datamanagement"]
+---
+
diff --git a/content/publication/chakraborty-arrowblog-22/cite.bib b/content/publication/chakraborty-arrowblog-22/cite.bib
new file mode 100644
index 00000000000..b3877907341
--- /dev/null
+++ b/content/publication/chakraborty-arrowblog-22/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{chakraborty:arrowblog22,
+ author = {Jayjeet Chakraborty and Carlos Maltzahn and David Li and Tom Drabas},
+ bdsk-url-1 = {https://arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/},
+ date-added = {2022-05-06 12:28:50 -0700},
+ date-modified = {2022-05-06 12:28:50 -0700},
+ keywords = {computation, storage, programmable, datamanagement, ceph, arrow},
+ month = {January 31},
+ note = {Available at arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/},
+ title = {Skyhook: Bringing Computation to Storage with Apache Arrow},
+ year = {2022}
+}
+
diff --git a/content/publication/chakraborty-arrowblog-22/index.md b/content/publication/chakraborty-arrowblog-22/index.md
new file mode 100644
index 00000000000..68d4b8d06e9
--- /dev/null
+++ b/content/publication/chakraborty-arrowblog-22/index.md
@@ -0,0 +1,53 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: 'Skyhook: Bringing Computation to Storage with Apache Arrow'
+subtitle: ''
+summary: ''
+authors:
+- Jayjeet Chakraborty
+- Carlos Maltzahn
+- David Li
+- Tom Drabas
+tags:
+- computation
+- storage
+- programmable
+- datamanagement
+- ceph
+- arrow
+categories: []
+date: 2022-01-31
+lastmod: 2023-07-05
+featured: false
+draft: false
+
+url_source: https://arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- skyhook
+- eusocial-storage
+- programmable-storage
+publishDate: '2022-05-08T18:22:08.329823Z'
+publication_types:
+- '0'
+abstract: 'CPUs, memory, storage, and network bandwidth get better every year, but increasingly, they’re improving in different dimensions. Processors are faster, but their memory bandwidth hasn’t kept up; meanwhile, cloud computing has led to storage being separated from applications across a network link. This divergent evolution means we need to rethink where and when we perform computation to best make use of the resources available to us.
+
+For example, when querying a dataset on a storage system like Ceph or Amazon S3, all the work of filtering data gets done by the client. Data has to be transferred over the network, and then the client has to spend precious CPU cycles decoding it, only to throw it away in the end due to a filter. While formats like Apache Parquet enable some optimizations, fundamentally, the responsibility is all on the client. Meanwhile, even though the storage system has its own compute capabilities, it’s relegated to just serving “dumb bytes”.
+
+Thanks to the [Center for Research in Open Source Software](https://cross.ucsc.edu) (CROSS) at the University of California, Santa Cruz, Apache Arrow 7.0.0 includes Skyhook, an [Arrow Datasets](https://arrow.apache.org/docs/cpp/dataset.html) extension that solves this problem by using the storage layer to reduce client resource utilization. We’ll examine the developments surrounding Skyhook as well as how Skyhook works.'
+publication: '[Apache Arrow Blog, January 31, 2022](https://arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/)'
+---
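+
+For contrast, the client-side path reads in plain PyArrow as below (the dataset path is made up). The projection and filter run on the client after the Parquet bytes have crossed the network; this is the work that Skyhook's Datasets extension pushes into the Ceph storage layer.
+
+```python
+# Plain Arrow Datasets usage, for comparison (hypothetical path): the
+# client downloads Parquet bytes and evaluates the filter itself.
+import pyarrow.dataset as ds
+
+dataset = ds.dataset("warehouse/trips/", format="parquet")
+table = dataset.to_table(
+    columns=["trip_id", "fare"],     # projection
+    filter=ds.field("fare") > 50.0,  # predicate, applied client-side here
+)
+print(table.num_rows)
+```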
diff --git a/content/publication/chakraborty-arxiv-21/cite.bib b/content/publication/chakraborty-arxiv-21/cite.bib
new file mode 100644
index 00000000000..ecccdd9069b
--- /dev/null
+++ b/content/publication/chakraborty-arxiv-21/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{chakraborty:arxiv21,
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Sebastiaan Alvarez Rodriguez and Alexandru Uta and Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktYXJ4aXYyMS5wZGZPEQF8AAAAAAF8AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8XY2hha3JhYm9ydHktYXJ4aXYyMS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAPC86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1hcnhpdjIxLnBkZgAOADAAFwBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBhAHIAeABpAHYAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA6VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1hcnhpdjIxLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABZAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ date-added = {2021-07-23 10:50:21 -0700},
+ date-modified = {2021-07-23 10:50:21 -0700},
+ keywords = {papers, programmable, storage, systems, arrow},
+ month = {May 21},
+ note = {arxiv.org/abs/2105.09894 [cs.DC]},
+ title = {Towards an Arrow-native Storage System},
+ year = {2021}
+}
+
diff --git a/content/publication/chakraborty-arxiv-21/index.md b/content/publication/chakraborty-arxiv-21/index.md
new file mode 100644
index 00000000000..03b8166e774
--- /dev/null
+++ b/content/publication/chakraborty-arxiv-21/index.md
@@ -0,0 +1,16 @@
+---
+title: "Towards an Arrow-native Storage System"
+date: 2021-05-01
+publishDate: 2021-07-23T18:52:38.470304Z
+authors: ["Jayjeet Chakraborty", "Ivo Jimenez", "Sebastiaan Alvarez Rodriguez", "Alexandru Uta", "Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["3"]
+abstract: "With the ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec the CPU has become the bottleneck, trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires extensive understanding of the internals. Previous approaches re-implemented functionality of data processing frameworks and access library for a particular storage system, a duplication of effort that might have to be repeated for different storage systems. In this paper, we introduce a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with minimal modifications. In this approach data processing frameworks and access libraries can evolve independently from storage systems while leveraging the scale-out and availability properties of distributed storage systems. We present one example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of our implementation and discuss key results."
+featured: false
+publication: "arXiv:2105.09894 [cs.DC]"
+tags: ["papers", "programmable", "storage", "systems", "arrow"]
+projects:
+- programmable-storage
+- declstore
+- eusocial-storage
+- skyhook
+---
diff --git a/content/publication/chakraborty-canopie-20/cite.bib b/content/publication/chakraborty-canopie-20/cite.bib
new file mode 100644
index 00000000000..a09183a8778
--- /dev/null
+++ b/content/publication/chakraborty-canopie-20/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{chakraborty:canopie20,
+ author = {Jayjeet Chakraborty and Carlos Maltzahn and Ivo Jimenez},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBOLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1jYW5vcGllMjAucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GWNoYWtyYWJvcnR5LWNhbm9waWUyMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAD/////AAAKAGN1AAAAAAAAAAAAAAAAAAFDAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkM6Y2hha3JhYm9ydHktY2Fub3BpZTIwLnBkZgAADgA0ABkAYwBoAGEAawByAGEAYgBvAHIAdAB5AC0AYwBhAG4AbwBwAGkAZQAyADAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2Fub3BpZTIwLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAB1AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAf0=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBWLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1jYW5vcGllLTIwLXNsaWRlcy5wZGZPEQGkAAAAAAGkAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8fY2hha3JhYm9ydHktY2Fub3BpI0ZGRkZGRkZGLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAP////8AAAoAY3UAAAAAAAAAAAAAAAAAAUMAAAIASS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1jYW5vcGllLTIwLXNsaWRlcy5wZGYAAA4ARAAhAGMAaABhAGsAcgBhAGIAbwByAHQAeQAtAGMAYQBuAG8AcABpAGUALQAyADAALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIANC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1jYW5vcGllLTIwLXNsaWRlcy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAfQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAIl},
+ booktitle = {CANOPIE HPC 2020 (at SC20)},
+ date-added = {2020-11-30 07:28:21 -0800},
+ date-modified = {2020-11-30 07:28:21 -0800},
+ keywords = {papers, reproducibility, containers, workflow, orchestration},
+ month = {November 12},
+ title = {Enabling seamless execution of computational and data science workflows on HPC and cloud with the Popper container-native automation engine},
+ year = {2020}
+}
+
diff --git a/content/publication/chakraborty-canopie-20/index.md b/content/publication/chakraborty-canopie-20/index.md
new file mode 100644
index 00000000000..754b70d901a
--- /dev/null
+++ b/content/publication/chakraborty-canopie-20/index.md
@@ -0,0 +1,13 @@
+---
+title: "Enabling seamless execution of computational and data science workflows on HPC and cloud with the Popper container-native automation engine"
+date: 2020-11-01
+publishDate: 2020-12-09T04:47:36.222445Z
+authors: ["Jayjeet Chakraborty", "Carlos Maltzahn", "Ivo Jimenez"]
+publication_types: ["1"]
+abstract: "The problem of reproducibility and replication in scientific research is quite prevalent to date. Researchers working in fields of computational science often find it difficult to reproduce experiments from artifacts like code, data, diagrams, and results which are left behind by the previous researchers.The code developed on one machine often fails to run on other machines due to differences in hardware architecture, OS, software dependencies, among others. This is accompanied by the difficulty in understanding how artifacts are organized, as well as in using them in the correct order. Software containers(also known as Linux containers) can be used to address some of these problems, and thus researchers and developers have built scientific workflow engines that execute the steps of a workflow in separate containers. Existing container-native workflow engines assume the availability of infrastructure deployed in the cloud or HPC centers. In this paper, we present Popper, a container-native workflow engine that does not assume the presence of a Kubernetes cluster or any service other than a container engine such as Docker or Podman. We introduce the design and architecture of Popper and describe how it abstracts away the complexity of multiple container engines and resource managers, enabling users to focus only on writing workflow logic. With Popper, researchers can build and validate workflows easily in almost any environment of their choice including local machines, Slurm based HPC clusters, CI services, or Kubernetes based cloud computing environments. To exemplify the suitability of this workflow engine, we present a case study where we take an example from machine learning and seamlessly execute it in multiple environments by implementing a Popper workflow for it."
+featured: false
+publication: "*CANOPIE HPC 2020 (at SC20)*"
+tags: ["papers", "reproducibility", "containers", "workflow", "orchestration"]
+projects:
+- practical-reproducibility
+---
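+
+The execution model itself is simple to sketch, even though Popper adds engine abstraction and resource-manager support on top. The following is a minimal illustration of container-native step execution, not Popper's code: each step is a command run in its own container with the workflow directory mounted.
+
+```python
+# Minimal illustration of container-native step execution (not Popper's
+# implementation): every step runs in its own container against the
+# mounted workflow directory.
+import os
+import subprocess
+
+def run_step(image: str, args: list[str]) -> None:
+    workspace = os.getcwd()
+    subprocess.run(
+        ["docker", "run", "--rm",
+         "-v", f"{workspace}:/workspace", "-w", "/workspace",
+         image, *args],
+        check=True,
+    )
+
+# A two-step workflow: fetch data, then process it, each in its own image.
+# run_step("curlimages/curl", ["-LO", "https://example.com/data.csv"])
+# run_step("python:3.11", ["python", "process.py", "data.csv"])
+```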
diff --git a/content/publication/chakraborty-ccgrid-22/cite.bib b/content/publication/chakraborty-ccgrid-22/cite.bib
new file mode 100644
index 00000000000..3d1d4e6800b
--- /dev/null
+++ b/content/publication/chakraborty-ccgrid-22/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{chakraborty:ccgrid22,
+ abstract = {With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently and save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the system internals. Previous approaches re-implemented functionality of data processing frameworks and access libraries for a particular storage system, a duplication of effort that might have to be repeated for different storage systems.
+
+This paper introduces a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with no modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging distributed storage systems' scale-out and availability properties. We present Skyhook, an example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of Skyhook and discuss key results.},
+ address = {Taormina (Messina), Italy},
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Sebastiaan Alvarez Rodriguez and Alexandru Uta and Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAzLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2NncmlkMjIucGRmTxEBggAAAAABggACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GGNoYWtyYWJvcnR5LWNjZ3JpZDIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFDAAACAD0vOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkM6Y2hha3JhYm9ydHktY2NncmlkMjIucGRmAAAOADIAGABjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBjAGMAZwByAGkAZAAyADIALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADtVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9DL2NoYWtyYWJvcnR5LWNjZ3JpZDIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAWgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHg},
+ booktitle = {CCGrid22},
+ date-added = {2022-04-11 19:45:31 -0700},
+ date-modified = {2022-04-11 19:57:58 -0700},
+ keywords = {papers, programmable, storage, systems, arrow, nsf1836650, nsf1705021, nsf1764102},
+ month = {May 16-19},
+ title = {Skyhook: Towards an Arrow-Native Storage System},
+ year = {2022}
+}
+
diff --git a/content/publication/chakraborty-ccgrid-22/index.md b/content/publication/chakraborty-ccgrid-22/index.md
new file mode 100644
index 00000000000..28256850d4a
--- /dev/null
+++ b/content/publication/chakraborty-ccgrid-22/index.md
@@ -0,0 +1,72 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: 'Skyhook: Towards an Arrow-Native Storage System'
+subtitle: ''
+summary: ''
+authors:
+- Jayjeet Chakraborty
+- Ivo Jimenez
+- Sebastiaan Alvarez Rodriguez
+- Alexandru Uta
+- Jeff LeFevre
+- Carlos Maltzahn
+tags:
+- papers
+- programmable
+- storage
+- systems
+- arrow
+- nsf1836650
+- nsf1705021
+- nsf1764102
+categories: []
+date: '2022-05-01'
+lastmod: 2022-04-25T15:07:59-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- programmable-storage
+- declstore
+- eusocial-storage
+- skyhook
+publishDate: '2022-04-25T22:07:44.206228Z'
+publication_types:
+- '1'
+abstract: With the ever-increasing dataset sizes, several file formats such as Parquet,
+  ORC, and Avro have been developed to store data efficiently and save network and
+  interconnect bandwidth at the price of additional CPU utilization. However, with
+ the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000
+ reqs/sec, the CPU has become the bottleneck trying to keep up feeding data in and
+ out of these fast devices. The result is that data access libraries executed on
+ single clients are often CPU-bound and cannot utilize the scale-out benefits of
+ distributed storage systems. One attractive solution to this problem is to offload
+ data-reducing processing and filtering tasks to the storage layer. However, modifying
+ legacy storage systems to support compute offloading is often tedious and requires
+ an extensive understanding of the system internals. Previous approaches re-implemented
+ functionality of data processing frameworks and access libraries for a particular
+ storage system, a duplication of effort that might have to be repeated for different
+ storage systems. This paper introduces a new design paradigm that allows extending
+ programmable object storage systems to embed existing, widely used data processing
+ frameworks and access libraries into the storage layer with no modifications. In
+ this approach, data processing frameworks and access libraries can evolve independently
+ from storage systems while leveraging distributed storage systems' scale-out and
+ availability properties. We present Skyhook, an example implementation of our design
+ paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation
+ of Skyhook and discuss key results.
+publication: '*CCGrid22*'
+---
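+
+The offload boundary can be sketched with Arrow itself (the function name below is invented; Skyhook embeds the C++ Arrow and Parquet libraries in Ceph object classes): when the unmodified access library runs next to the data, only the reduced result travels back to the client.
+
+```python
+# Sketch of the offload boundary (names invented): the storage side embeds
+# the unmodified access library and ships back only reduced data.
+import pyarrow as pa
+import pyarrow.dataset as ds
+
+def storage_side_scan(path: str, column: str, threshold: float) -> pa.Table:
+    """Imagine this running inside a storage server: the embedded
+    Parquet/Arrow library decodes and filters locally."""
+    dataset = ds.dataset(path, format="parquet")
+    return dataset.to_table(filter=ds.field(column) > threshold)
+
+# The client receives only matching rows; decode CPU and network transfer
+# for discarded rows stay on the storage side.
+# reduced = storage_side_scan("objects/part-0.parquet", "fare", 50.0)
+```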
diff --git a/content/publication/chakraborty-ecpam-20/cite.bib b/content/publication/chakraborty-ecpam-20/cite.bib
new file mode 100644
index 00000000000..fc6fc1a267d
--- /dev/null
+++ b/content/publication/chakraborty-ecpam-20/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{chakraborty:ecpam20,
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Carlos Maltzahn and Arshul Mansoori and Quincy Wofford},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBMLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1lY3BhbTIwLnBkZk8RAXwAAAAAAXwAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xdjaGFrcmFib3J0eS1lY3BhbTIwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAA/////wAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA/LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNoYWtyYWJvcnR5LWVjcGFtMjAucGRmAAAOADAAFwBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBlAGMAcABhAG0AMgAwAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAqL015IERyaXZlL1BhcGVycy9DL2NoYWtyYWJvcnR5LWVjcGFtMjAucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAHMAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB8w==},
+ bdsk-url-1 = {https://ecpannualmeeting.com/},
+ booktitle = {Poster at 2020 Exascale Computing Project Annual Meeting, Houston, TX, February 3-7, 2020},
+ date-added = {2020-02-05 11:34:01 -0800},
+ date-modified = {2020-02-05 11:34:01 -0800},
+ keywords = {shortpapers, reproducibility, containers, workflow, automation},
+ title = {Popper 2.0: A Container-native Workflow Execution Engine For Testing Complex Applications and Validating Scientific Claims},
+ year = {2020}
+}
+
diff --git a/content/publication/chakraborty-ecpam-20/index.md b/content/publication/chakraborty-ecpam-20/index.md
new file mode 100644
index 00000000000..13a4b057d0e
--- /dev/null
+++ b/content/publication/chakraborty-ecpam-20/index.md
@@ -0,0 +1,14 @@
+---
+title: "Popper 2.0: A Container-native Workflow Execution Engine For Testing Complex Applications and Validating Scientific Claims"
+date: 2020-01-01
+publishDate: 2020-02-05T21:50:02.450974Z
+authors: ["Jayjeet Chakraborty", "Ivo Jimenez", "Carlos Maltzahn", "Arshul Mansoori", "Quincy Wofford"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster at 2020 Exaxcale Computing Project Annual Meeting, Houston, TX, February 3-7, 2020*"
+tags: ["shortpapers", "reproducibility", "containers", "workflow", "automation"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/chakraborty-sdc-21/cite.bib b/content/publication/chakraborty-sdc-21/cite.bib
new file mode 100644
index 00000000000..b2581a3e770
--- /dev/null
+++ b/content/publication/chakraborty-sdc-21/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{chakraborty:sdc21,
+ address = {Virtual},
+ author = {Jayjeet Chakraborty and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktc25pYTIxLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAN+vI3hCRAAB/////xZjaGFrcmFib3J0eS1zbmlhMjEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TplAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABQwAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpDOmNoYWtyYWJvcnR5LXNuaWEyMS5wZGYAAA4ALgAWAGMAaABhAGsAcgBhAGIAbwByAHQAeQAtAHMAbgBpAGEAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1zbmlhMjEucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA4Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktc25pYTIxLXNsaWRlcy5wZGZPEQGUAAAAAAGUAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADfryN4QkQAAf////8dY2hha3JhYm9ydHktc25pYTIxLXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9/k6/EAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAQi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1zbmlhMjEtc2xpZGVzLnBkZgAOADwAHQBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBzAG4AaQBhADIAMQAtAHMAbABpAGQAZQBzAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgBAVXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1zbmlhMjEtc2xpZGVzLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABfAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAfc=},
+ booktitle = {SNIA SDC 2021},
+ date-added = {2023-01-11 22:30:29 -0800},
+ date-modified = {2023-01-11 22:32:09 -0800},
+ keywords = {programmable, storage},
+ month = {September 28-29},
+ title = {SkyhookDM: An Arrow-Native Storage System},
+ year = {2021}
+}
+
diff --git a/content/publication/chakraborty-sdc-21/index.md b/content/publication/chakraborty-sdc-21/index.md
new file mode 100644
index 00000000000..352acdfe818
--- /dev/null
+++ b/content/publication/chakraborty-sdc-21/index.md
@@ -0,0 +1,12 @@
+---
+title: "SkyhookDM: An Arrow-Native Storage System"
+date: 2021-09-01
+publishDate: 2023-01-26T14:23:16.862964Z
+authors: ["Jayjeet Chakraborty", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*SNIA SDC 2021*"
+tags: ["programmable", "storage"]
+---
+
diff --git a/content/publication/chu-chep-19/cite.bib b/content/publication/chu-chep-19/cite.bib
new file mode 100644
index 00000000000..6247dadbe05
--- /dev/null
+++ b/content/publication/chu-chep-19/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{chu:chep19,
+ abstract = {Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns cause less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems, and 3) full leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.},
+ address = {Adelaide, Australia},
+ author = {Aaron Chu and Jeff LeFevre and Carlos Maltzahn and Aldrin Montana and Peter Alvaro and Dana Robinson and Quincey Koziol},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBDLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtY2hlcDE5LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5jaHUtY2hlcDE5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAA/////wAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNodS1jaGVwMTkucGRmAA4AHgAOAGMAaAB1AC0AYwBoAGUAcAAxADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAGoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABxg==},
+ bdsk-url-1 = {https://indico.cern.ch/event/773049/contributions/3474413/},
+ booktitle = {24th International Conference on Computing in High Energy & Nuclear Physics},
+ date-added = {2020-01-20 16:19:51 -0800},
+ date-modified = {2020-01-20 16:46:26 -0800},
+ keywords = {papers, programmable, declarative, objectstorage},
+ month = {November 4-8},
+ title = {SkyhookDM: Mapping Scientific Datasets to Programmable Storage},
+ year = {2019}
+}
+
diff --git a/content/publication/chu-chep-19/index.md b/content/publication/chu-chep-19/index.md
new file mode 100644
index 00000000000..4493d108dce
--- /dev/null
+++ b/content/publication/chu-chep-19/index.md
@@ -0,0 +1,20 @@
+---
+title: "SkyhookDM: Mapping Scientific Datasets to Programmable Storage"
+date: 2019-11-01
+publishDate: 2020-01-21T00:53:05.251650Z
+authors: ["Aaron Chu", "Jeff LeFevre", "Carlos Maltzahn", "Aldrin Montana", "Peter Alvaro", "Dana Robinson", "Quincey Koziol"]
+publication_types: ["1"]
+abstract: "Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns are causing less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) fully leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations."
+featured: false
+publication: "*24th International Conference on Computing in High Energy & Nuclear Physics*"
+tags: ["papers", "programmable", "declarative", "objectstorage"]
+links:
+- name: Abstract
+ url: https://indico.cern.ch/event/773049/contributions/3474413/
+url_slides: https://indico.cern.ch/event/773049/contributions/3474413/attachments/1936811/3213825/CHEP2019-Mapping_datasets_to_object_storage.pdf
+projects:
+- declstore
+- programmable-storage
+- eusocial-storage
+- skyhook
+---
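+
+The outdated access-pattern assumption called out above can be probed directly. The sketch below is only an illustrative benchmark shape (paths and sizes are placeholders, and results vary by device): on flash media, many small parallel random reads often beat the single-threaded sequential read that access libraries are tuned for.
+
+```python
+# Illustrative benchmark shape only: one large sequential read vs. many
+# small parallel random reads over the same byte range.
+import random
+import time
+from concurrent.futures import ThreadPoolExecutor
+
+def sequential(path: str, size: int) -> float:
+    t = time.perf_counter()
+    with open(path, "rb") as f:
+        f.read(size)
+    return time.perf_counter() - t
+
+def parallel_random(path: str, size: int, chunk: int = 64 * 1024) -> float:
+    offsets = list(range(0, size, chunk))
+    random.shuffle(offsets)
+    def read_at(off: int) -> None:
+        with open(path, "rb") as f:
+            f.seek(off)
+            f.read(chunk)
+    t = time.perf_counter()
+    with ThreadPoolExecutor(max_workers=16) as pool:
+        list(pool.map(read_at, offsets))
+    return time.perf_counter() - t
+
+# print(sequential("data.bin", 1 << 30), parallel_random("data.bin", 1 << 30))
+```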
diff --git a/content/publication/chu-epjconf-20/cite.bib b/content/publication/chu-epjconf-20/cite.bib
new file mode 100644
index 00000000000..44f2af3df88
--- /dev/null
+++ b/content/publication/chu-epjconf-20/cite.bib
@@ -0,0 +1,17 @@
+@article{chu:epjconf20,
+ abstract = {Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns cause less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems, and 3) full leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.},
+ author = {Aaron Chu and Jeff LeFevre and Carlos Maltzahn and Aldrin Montana and Peter Alvaro and Dana Robinson and Quincey Koziol},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBGLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtZXBqY29uZjIwLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFjaHUtZXBqY29uZjIwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAA/////wAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNodS1lcGpjb25mMjAucGRmAAAOACQAEQBjAGgAdQAtAGUAcABqAGMAbwBuAGYAMgAwAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9DL2NodS1lcGpjb25mMjAucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAG0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB1Q==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBKLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtY2hlcDE5LXNsaWRlcy5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VY2h1LWNoZXAxOS1zbGlkZXMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAP////8AAAoAY3UAAAAAAAAAAAAAAAAAAUMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QzpjaHUtY2hlcDE5LXNsaWRlcy5wZGYAAA4ALAAVAGMAaAB1AC0AYwBoAGUAcAAxADkALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtY2hlcDE5LXNsaWRlcy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAcQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHp},
+ bdsk-url-1 = {https://indico.cern.ch/event/773049/contributions/3474413/},
+ date-added = {2020-12-10 16:45:30 -0800},
+ date-modified = {2020-12-10 16:50:06 -0800},
+ journal = {EPJ Web Conf.},
+ keywords = {papers, programmable, declarative, objectstorage, nsf1836650},
+ month = {November 16},
+ pages = {04037},
+ title = {Mapping Scientific Datasets to Programmable Storage},
+ volume = {245},
+ year = {2020}
+}
+
diff --git a/content/publication/chu-epjconf-20/index.md b/content/publication/chu-epjconf-20/index.md
new file mode 100644
index 00000000000..00584d820c4
--- /dev/null
+++ b/content/publication/chu-epjconf-20/index.md
@@ -0,0 +1,16 @@
+---
+title: "Mapping Scientific Datasets to Programmable Storage"
+date: 2020-11-01
+publishDate: 2021-02-21T00:24:01.226701Z
+authors: ["Aaron Chu", "Jeff LeFevre", "Carlos Maltzahn", "Aldrin Montana", "Peter Alvaro", "Dana Robinson", "Quincey Koziol"]
+publication_types: ["2"]
+abstract: "Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns are causing less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) fully leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations."
+featured: false
+publication: "*EPJ Web Conf.*"
+tags: ["papers", "programmable", "declarative", "objectstorage", "nsf1836650"]
+projects:
+- declstore
+- programmable-storage
+- eusocial-storage
+- skyhook
+---
diff --git a/content/publication/chu-irishep-20-poster/cite.bib b/content/publication/chu-irishep-20-poster/cite.bib
new file mode 100644
index 00000000000..6c6558301ef
--- /dev/null
+++ b/content/publication/chu-irishep-20-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{chu:irishep20poster,
+ address = {Princeton, NJ},
+ author = {Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBMLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtaXJpc2hlcDIwcG9zdGVyLnBkZk8RAXwAAAAAAXwAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xdjaHUtaXJpc2hlcDIwcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAA/////wAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA/LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNodS1pcmlzaGVwMjBwb3N0ZXIucGRmAAAOADAAFwBjAGgAdQAtAGkAcgBpAHMAaABlAHAAMgAwAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAqL015IERyaXZlL1BhcGVycy9DL2NodS1pcmlzaGVwMjBwb3N0ZXIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAHMAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB8w==},
+ booktitle = {Poster at IRIS-HEP Poster Session},
+ date-added = {2020-03-09 22:19:08 -0700},
+ date-modified = {2020-03-09 22:19:08 -0700},
+ keywords = {poster, programmable, storage, hep},
+ month = {February 27},
+ title = {SkyhookDM: Programmable Storage for Datasets},
+ year = {2020}
+}
+
diff --git a/content/publication/chu-irishep-20-poster/index.md b/content/publication/chu-irishep-20-poster/index.md
new file mode 100644
index 00000000000..95397b35b81
--- /dev/null
+++ b/content/publication/chu-irishep-20-poster/index.md
@@ -0,0 +1,16 @@
+---
+title: "SkyhookDM: Programmable Storage for Datasets"
+date: 2020-02-01
+publishDate: 2020-03-10T05:36:27.951306Z
+authors: ["Aaron Chu", "Ivo Jimenez", "Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster at IRIS-HEP Poster Session*"
+tags: ["poster", "programmable", "storage", "hep"]
+projects:
+- programmable-storage
+- declstore
+- skyhook
+url_poster: https://indico.cern.ch/event/894127/attachments/1996570/3331170/2_-_maltzahn-irishep-poster.pdf
+---
diff --git a/content/publication/crume-msst-14/cite.bib b/content/publication/crume-msst-14/cite.bib
new file mode 100644
index 00000000000..0b043fd5187
--- /dev/null
+++ b/content/publication/crume-msst-14/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{crume:msst14,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains.},
+ address = {Santa Clara, CA},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASQy9jcnVtZS1tc3N0MTQucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGNydW1lLW1zc3QxNC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFDAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkM6Y3J1bWUtbXNzdDE0LnBkZgAOACIAEABjAHIAdQBtAGUALQBtAHMAcwB0ADEANAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvQy9jcnVtZS1tc3N0MTQucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {MSST '14},
+ date-added = {2014-05-10 00:02:27 +0000},
+ date-modified = {2020-01-04 21:58:30 -0700},
+ keywords = {papers, machinelearning, modeling, simulation, storagemedium, autotuning},
+ month = {June 2-6},
+ title = {Automatic Generation of Behavioral Hard Disk Drive Access Time Models},
+ year = {2014}
+}
+
diff --git a/content/publication/crume-msst-14/index.md b/content/publication/crume-msst-14/index.md
new file mode 100644
index 00000000000..890bce45696
--- /dev/null
+++ b/content/publication/crume-msst-14/index.md
@@ -0,0 +1,14 @@
+---
+title: "Automatic Generation of Behavioral Hard Disk Drive Access Time Models"
+date: 2014-06-01
+publishDate: 2020-01-05T06:43:50.505679Z
+authors: ["Adam Crume", "Carlos Maltzahn", "Lee Ward", "Thomas Kroeger", "Matthew Curry"]
+publication_types: ["1"]
+abstract: "Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such a fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains."
+featured: false
+publication: "*MSST '14*"
+tags: ["papers", "machinelearning", "modeling", "simulation", "storagemedium", "autotuning"]
+projects:
+- storage-simulation
+---
+
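The abstract above describes the Fourier-assisted modeling idea: identify the high, unknown frequencies of periodic drive behavior with Fourier analysis, then hand them to a learned model as explicit inputs. The following is a minimal, hypothetical sketch of that technique, assuming only numpy; the function names and the choice of downstream regressor are illustrative, not the paper's code.

```python
import numpy as np

def dominant_frequencies(access_times, n_freqs=3):
    """Find the n_freqs strongest non-DC frequencies in measured access times.

    access_times: 1-D numpy array of per-request access times.
    Returns frequencies in cycles per sample.
    """
    spectrum = np.abs(np.fft.rfft(access_times - access_times.mean()))
    freqs = np.fft.rfftfreq(len(access_times))
    top = np.argsort(spectrum[1:])[-n_freqs:] + 1  # +1 skips the DC bin
    return freqs[top]

def fourier_features(offsets, freqs):
    """Sine/cosine features at the detected frequencies, one row per request."""
    cols = [f(2 * np.pi * k * offsets) for k in freqs for f in (np.sin, np.cos)]
    return np.column_stack(cols)

# A regressor (e.g., a small neural net, as in the related PDSW'13 entry below)
# would then be fit on fourier_features(request_offsets, freqs) against the
# measured access times.
```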
diff --git a/content/publication/crume-pdsw-12/cite.bib b/content/publication/crume-pdsw-12/cite.bib
new file mode 100644
index 00000000000..14985f20c95
--- /dev/null
+++ b/content/publication/crume-pdsw-12/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{crume:pdsw12,
+ abstract = {In Hadoop, mappers send data to reducers in the form of key/value pairs. The default design of Hadoop's process for transmitting this intermediate data can cause a very high overhead, especially for scientific data containing multiple variables in a multi-dimensional space. For example, for a 3D scalar field of a variable ``windspeed1'', the size of keys was 6.75 times the size of values. Much of the disk and network bandwidth of ``shuffling'' this intermediate data is consumed by repeatedly transmitting the variable name for each value. This significant waste of resources is due to an assumption fundamental to Hadoop's design that all key/values are independent. This assumption is inadequate for scientific data which is often organized in regular grids, a structure that can be described in small, constant size.
+Earlier we presented SciHadoop, a slightly modified version of Hadoop designed for processing scientific data. We reported on experiments with SciHadoop which confirm that the size of intermediate data has a significant impact on overall performance. Here we show preliminary designs of multiple lossless approaches to compressing intermediate data, one of which results in up to five orders of magnitude reduction of the original key/value ratio.},
+ address = {Salt Lake City, UT},
+ author = {Adam Crume and Joe Buck and Carlos Maltzahn and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASQy9jcnVtZS1wZHN3MTIucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGNydW1lLXBkc3cxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFDAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkM6Y3J1bWUtcGRzdzEyLnBkZgAOACIAEABjAHIAdQBtAGUALQBwAGQAcwB3ADEAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvQy9jcnVtZS1wZHN3MTIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAZQy9jcnVtZS1wZHN3MTItc2xpZGVzLnBkZk8RAXwAAAAAAXwAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xdjcnVtZS1wZHN3MTItc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA/LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNydW1lLXBkc3cxMi1zbGlkZXMucGRmAAAOADAAFwBjAHIAdQBtAGUALQBwAGQAcwB3ADEAMgAtAHMAbABpAGQAZQBzAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAqL015IERyaXZlL1BhcGVycy9DL2NydW1lLXBkc3cxMi1zbGlkZXMucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABwA==},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:02:29 +0000},
+ date-modified = {2020-01-05 06:29:22 -0700},
+ keywords = {papers, mapreduce, compression, array},
+ month = {November 12},
+ title = {Compressing Intermediate Keys between Mappers and Reducers in SciHadoop},
+ year = {2012}
+}
+
diff --git a/content/publication/crume-pdsw-12/index.md b/content/publication/crume-pdsw-12/index.md
new file mode 100644
index 00000000000..8d698963246
--- /dev/null
+++ b/content/publication/crume-pdsw-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "Compressing Intermediate Keys between Mappers and Reducers in SciHadoop"
+date: 2012-11-01
+publishDate: 2020-01-05T13:33:05.959334Z
+authors: ["Adam Crume", "Joe Buck", "Carlos Maltzahn", "Scott Brandt"]
+publication_types: ["1"]
+abstract: "In Hadoop mappers send data to reducers in the form of key/value pairs. The default design of Hadoop's process for transmitting this intermediate data can cause a very high overhead, especially for scientific data containing multiple variables in a multi-dimensional space. For example, for a 3D scalar field of a variable ``windspeed1'' the size of keys was 6.75 times the size of values. Much of the disk and network bandwidth of ``shuffling'' this intermediate data is consumed by repeatedly transmitting the variable name for each value. This significant waste of resources is due to an assumption fundamental to Hadoop's design that all key/values are independent. This assumption is inadequate for scientific data which is often organized in regular grids, a structure that can be described in small, constant size. Earlier we presented SciHadoop, a slightly modified version of Hadoop designed for processing scientific data. We reported on experiments with SciHadoop which confirm that the size of intermediate data has a significant impact on overall performance. Here we show preliminary designs of multiple lossless approaches to compressing intermediate data, one of which results in up to five orders of magnitude reduction the original key/value ratio."
+featured: false
+publication: "*PDSW'12*"
+tags: ["papers", "mapreduce", "compression", "array"]
+---
+
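As a hedged sketch of the inefficiency this abstract quantifies (the variable name repeated in every key versus a regular grid described once, in constant size), consider the following. This is purely illustrative Python that assumes nothing about SciHadoop's actual key encoding; all names are hypothetical.

```python
import json

def naive_pairs(var, shape):
    """One key/value pair per grid point: the variable name travels with each."""
    nx, ny, nz = shape
    return [((var, i, j, k), 0.0)  # 0.0 stands in for the actual sample
            for i in range(nx) for j in range(ny) for k in range(nz)]

def grid_descriptor(var, origin, shape):
    """Constant-size description of the same regular grid."""
    return json.dumps({"var": var, "origin": origin, "shape": shape})

pairs = naive_pairs("windspeed1", (4, 4, 4))    # 64 keys, name repeated 64 times
desc = grid_descriptor("windspeed1", (0, 0, 0), (4, 4, 4))  # one small record
print(len(pairs), len(desc))  # key count grows with the grid; the descriptor doesn't
```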
diff --git a/content/publication/crume-pdsw-13/cite.bib b/content/publication/crume-pdsw-13/cite.bib
new file mode 100644
index 00000000000..99e2257c9bf
--- /dev/null
+++ b/content/publication/crume-pdsw-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{crume:pdsw13,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. Others have created behavioral models of hard disk drive performance, but none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We show how hard disk drive access times can be predicted to within 0.83 ms using a neural net after these frequencies are found using Fourier analysis.},
+ address = {Denver, CO},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry and Ron Oldfield},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASQy9jcnVtZS1wZHN3MTMucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGNydW1lLXBkc3cxMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFDAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkM6Y3J1bWUtcGRzdzEzLnBkZgAOACIAEABjAHIAdQBtAGUALQBwAGQAcwB3ADEAMwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvQy9jcnVtZS1wZHN3MTMucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {PDSW'13},
+ date-added = {2013-11-30 19:31:15 +0000},
+ date-modified = {2020-01-04 22:00:13 -0700},
+ keywords = {papers, machinelearning, performance, modeling, storagemedium, neuralnetworks},
+ month = {November 18},
+ title = {Fourier-Assisted Machine Learning of Hard Disk Drive Access Time Models},
+ year = {2013}
+}
+
diff --git a/content/publication/crume-pdsw-13/index.md b/content/publication/crume-pdsw-13/index.md
new file mode 100644
index 00000000000..75ed53e891b
--- /dev/null
+++ b/content/publication/crume-pdsw-13/index.md
@@ -0,0 +1,14 @@
+---
+title: "Fourier-Assisted Machine Learning of Hard Disk Drive Access Time Models"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.522924Z
+authors: ["Adam Crume", "Carlos Maltzahn", "Lee Ward", "Thomas Kroeger", "Matthew Curry", "Ron Oldfield"]
+publication_types: ["1"]
+abstract: "Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. Others have created behavioral models of hard disk drive performance, but none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We show how hard disk drive access times can be predicted to within 0.83 ms using a neural net after these frequencies are found using Fourier analysis."
+featured: false
+publication: "*PDSW'13*"
+tags: ["papers", "machinelearning", "performance", "modeling", "storagemedium", "neuralnetworks"]
+projects:
+- storage-simulation
+---
+
diff --git a/content/publication/crume-sc-11-poster/cite.bib b/content/publication/crume-sc-11-poster/cite.bib
new file mode 100644
index 00000000000..394de7208c1
--- /dev/null
+++ b/content/publication/crume-sc-11-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{crume:sc11poster,
+ address = {Seattle, WA},
+ author = {Adam Crume and Carlos Maltzahn and Jason Cope and Sam Lang and Rob Ross and Phil Carns and Chris Carothers and Ning Liu and Curtis L. Janssen and John Bent and Stephen Eidenbenz and Meghan Wingate},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWQy9jcnVtZS1zYzExcG9zdGVyLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRjcnVtZS1zYzExcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNydW1lLXNjMTFwb3N0ZXIucGRmAA4AKgAUAGMAcgB1AG0AZQAtAHMAYwAxADEAcABvAHMAdABlAHIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0MvY3J1bWUtc2MxMXBvc3Rlci5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {Poster Session at SC 11},
+ date-added = {2012-03-01 20:39:54 +0000},
+ date-modified = {2020-01-05 05:31:34 -0700},
+ keywords = {shortpapers, machinelearning, simulation, performance},
+ month = {November 12-18},
+ title = {FLAMBES: Evolving Fast Performance Models},
+ year = {2011}
+}
+
diff --git a/content/publication/crume-sc-11-poster/index.md b/content/publication/crume-sc-11-poster/index.md
new file mode 100644
index 00000000000..0f2ef1539fb
--- /dev/null
+++ b/content/publication/crume-sc-11-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "FLAMBES: Evolving Fast Performance Models"
+date: 2011-11-01
+publishDate: 2020-01-05T12:39:43.046414Z
+authors: ["Adam Crume", "Carlos Maltzahn", "Jason Cope", "Sam Lang", "Rob Ross", "Phil Carns", "Chris Carothers", "Ning Liu", "Curtis L. Janssen", "John Bent", "Stephen Eidenbenz", "Meghan Wingate"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at SC 11*"
+tags: ["shortpapers", "machinelearning", "simulation", "performance"]
+---
+
diff --git a/content/publication/crume-ucsctr-12/cite.bib b/content/publication/crume-ucsctr-12/cite.bib
new file mode 100644
index 00000000000..a078f80fd04
--- /dev/null
+++ b/content/publication/crume-ucsctr-12/cite.bib
@@ -0,0 +1,15 @@
+@techreport{crume:ucsctr12,
+ address = {Santa Cruz, CA},
+ author = {Adam Crume and Joe Buck and Noah Watkins and Carlos Maltzahn and Scott Brandt and Neoklis Polyzotis},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUQy9jcnVtZS11Y3NjdHIxMi5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SY3J1bWUtdWNzY3RyMTIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUMAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QzpjcnVtZS11Y3NjdHIxMi5wZGYADgAmABIAYwByAHUAbQBlAC0AdQBjAHMAYwB0AHIAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9DL2NydW1lLXVjc2N0cjEyLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ date-added = {2013-05-30 22:54:07 +0000},
+ date-modified = {2013-05-30 22:55:49 +0000},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, compression, hadoop, semantic, structured, datamanagement, mapreduce},
+ month = {August 16},
+ number = {UCSC-SOE-12-13},
+ title = {SciHadoop Semantic Compression},
+ type = {Technical Report},
+ year = {2012}
+}
+
diff --git a/content/publication/crume-ucsctr-12/index.md b/content/publication/crume-ucsctr-12/index.md
new file mode 100644
index 00000000000..619144166de
--- /dev/null
+++ b/content/publication/crume-ucsctr-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "SciHadoop Semantic Compression"
+date: 2012-08-01
+publishDate: 2020-01-05T06:43:50.540344Z
+authors: ["Adam Crume", "Joe Buck", "Noah Watkins", "Carlos Maltzahn", "Scott Brandt", "Neoklis Polyzotis"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "compression", "hadoop", "semantic", "structured", "datamanagement", "mapreduce"]
+---
+
diff --git a/content/publication/crume-ucsctr-14/cite.bib b/content/publication/crume-ucsctr-14/cite.bib
new file mode 100644
index 00000000000..caeb928312e
--- /dev/null
+++ b/content/publication/crume-ucsctr-14/cite.bib
@@ -0,0 +1,16 @@
+@techreport{crume:ucsctr14,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains.},
+ address = {Santa Cruz, CA},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUQy9jcnVtZS11Y3NjdHIxNC5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SY3J1bWUtdWNzY3RyMTQucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUMAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QzpjcnVtZS11Y3NjdHIxNC5wZGYADgAmABIAYwByAHUAbQBlAC0AdQBjAHMAYwB0AHIAMQA0AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9DL2NydW1lLXVjc2N0cjE0LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ date-added = {2014-03-28 22:23:23 +0000},
+ date-modified = {2020-01-04 21:59:33 -0700},
+ institution = {University of California at Santa Cruz},
+ keywords = {papers, machinelearning, storagemedium, simulation, modeling, autotuning, neuralnetworks},
+ month = {March 28},
+ number = {UCSC-SOE-14-02},
+ title = {Automatic Generation of Behavioral Hard Disk Drive Access Time Models},
+ type = {Technical Report},
+ year = {2014}
+}
+
diff --git a/content/publication/crume-ucsctr-14/index.md b/content/publication/crume-ucsctr-14/index.md
new file mode 100644
index 00000000000..68d34c06a0f
--- /dev/null
+++ b/content/publication/crume-ucsctr-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Automatic Generation of Behavioral Hard Disk Drive Access Time Models"
+date: 2014-03-01
+publishDate: 2020-01-05T06:43:50.512823Z
+authors: ["Adam Crume", "Carlos Maltzahn", "Lee Ward", "Thomas Kroeger", "Matthew Curry"]
+publication_types: ["4"]
+abstract: "Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such a fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains."
+featured: false
+publication: ""
+tags: ["papers", "machinelearning", "storagemedium", "simulation", "modeling", "autotuning", "neuralnetworks"]
+---
+
diff --git a/content/publication/dahlgren-pdsw-19/cite.bib b/content/publication/dahlgren-pdsw-19/cite.bib
new file mode 100644
index 00000000000..fb4e02fdb7d
--- /dev/null
+++ b/content/publication/dahlgren-pdsw-19/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{dahlgren:pdsw19,
+ abstract = {In the post-Moore era, systems and devices with new architectures will arrive at a rapid rate with significant impacts on the software stack. Applications will not be able to fully benefit from new architectures unless they can delegate adapting to new devices in lower layers of the stack. In this paper we introduce physical design management which deals with the problem of identifying and executing transformations on physical designs of stored data, i.e., how data is mapped to storage abstractions like files, objects, or blocks, in order to improve performance. Physical design is traditionally placed with applications, access libraries, and databases, using hard-wired assumptions about underlying storage systems. Yet, storage systems increasingly not only contain multiple kinds of storage devices with vastly different performance profiles but also move data among those storage devices, thereby changing the benefit of a particular physical design. We advocate placing physical design management in storage, identify interesting research challenges, provide a brief description of a prototype implementation in Ceph, and discuss the results of initial experiments at scale that are replicable using CloudLab. These experiments show performance and resource utilization trade-offs associated with choosing different physical designs and choosing to transform between physical designs.},
+ address = {Denver, CO},
+ author = {Kathryn Dahlgren and Jeff LeFevre and Ashay Shirwadkar and Ken Iizawa and Aldrin Montana and Peter Alvaro and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVRC9kYWhsZ3Jlbi1wZHN3MTkucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2RhaGxncmVuLXBkc3cxOS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFEAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkQ6ZGFobGdyZW4tcGRzdzE5LnBkZgAADgAoABMAZABhAGgAbABnAHIAZQBuAC0AcABkAHMAdwAxADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0QvZGFobGdyZW4tcGRzdzE5LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {4th International Parallel Data Systems Workshop (PDSW 2019, co-located with SC'19)},
+ date-added = {2019-12-26 15:35:44 -0800},
+ date-modified = {2020-01-04 21:24:17 -0700},
+ keywords = {papers, programmable, storage, datamanagement, physicaldesign},
+ month = {November 18},
+ title = {Towards Physical Design Management in Storage Systems},
+ year = {2019}
+}
+
diff --git a/content/publication/dahlgren-pdsw-19/index.md b/content/publication/dahlgren-pdsw-19/index.md
new file mode 100644
index 00000000000..4a4336583fe
--- /dev/null
+++ b/content/publication/dahlgren-pdsw-19/index.md
@@ -0,0 +1,17 @@
+---
+title: "Towards Physical Design Management in Storage Systems"
+date: 2019-11-01
+publishDate: 2020-01-05T06:43:50.411373Z
+authors: ["Kathryn Dahlgren", "Jeff LeFevre", "Ashay Shirwadkar", "Ken Iizawa", "Aldrin Montana", "Peter Alvaro", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "In the post-Moore era, systems and devices with new architectures will arrive at a rapid rate with significant impacts on the software stack. Applications will not be able to fully benefit from new architectures unless they can delegate adapting to new devices in lower layers of the stack. In this paper we introduce physical design management which deals with the problem of identifying and executing transformations on physical designs of stored data, i.e. how data is mapped to storage abstractions like files, objects, or blocks, in order to improve performance. Physical design is traditionally placed with applications, access libraries, and databases, using hard- wired assumptions about underlying storage systems. Yet, storage systems increasingly not only contain multiple kinds of storage devices with vastly different performance profiles but also move data among those storage devices, thereby changing the benefit of a particular physical design. We advocate placing physical design management in storage, identify interesting research challenges, provide a brief description of a prototype implementation in Ceph, and discuss the results of initial experiments at scale that are replicable using Cloudlab. These experiments show performance and resource utilization trade-offs associated with choosing different physical designs and choosing to transform between physical designs."
+featured: false
+publication: "*4th International Parallel Data Systems Workshop (PDSW 2019, co-located with SC'19)*"
+url_slides: "http://www.pdsw.org/pdsw19/slides/JeffLeFevre-pdsw19.pdf"
+tags: ["papers", "programmable", "storage", "datamanagement", "physicaldesign"]
+projects:
+- programmable-storage
+- eusocial-storage
+- declstore
+- skyhook
+---
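To make the notion of a physical-design transformation concrete, here is a hedged sketch: the same stored records remapped from a row-oriented to a column-oriented layout, the kind of transformation a storage server could execute internally. This is illustrative Python only and does not reflect the Ceph prototype's API.

```python
def rows_to_columns(rows):
    """Remap a row-oriented physical design to a column-oriented one."""
    if not rows:
        return {}
    return {field: [row[field] for row in rows] for field in rows[0]}

# Scan-heavy analytical workloads favor the columnar design; point lookups
# favor the row design. The data is the same; only its physical mapping changes.
rows = [{"ts": 1, "temp": 20.5}, {"ts": 2, "temp": 21.0}]
print(rows_to_columns(rows))  # {'ts': [1, 2], 'temp': [20.5, 21.0]}
```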
diff --git a/content/publication/david-precs-19/cite.bib b/content/publication/david-precs-19/cite.bib
new file mode 100644
index 00000000000..dd25bba8a14
--- /dev/null
+++ b/content/publication/david-precs-19/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{david:precs19,
+ abstract = {Computer network research experiments can be broadly grouped into three categories: simulated, controlled, and real-world experiments. Simulation frameworks, experiment testbeds, and measurement tools, respectively, are commonly used as the platforms for carrying out network experiments. In many cases, given the nature of computer network experiments, properly configuring these platforms is a complex and time-consuming task, which makes replicating and validating research results quite challenging. This complexity can be reduced by leveraging tools that enable experiment reproducibility. In this paper, we show how a recently proposed reproducibility tool called Popper facilitates the reproduction of networking experiments. In particular, we detail the steps taken to reproduce results in two published articles that rely on simulations. The outcome of this exercise is a generic workflow for carrying out network simulation experiments. In addition, we briefly present two additional Popper workflows for running experiments on controlled testbeds, as well as studies that gather real-world metrics (all code is publicly available on GitHub). We close by providing a list of lessons we learned throughout this process.},
+ author = {Andrea David and Mariette Souppe and Ivo Jimenez and Katia Obraczka and Sam Mansfield and Kerry Veenstra and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATRC9kYXZpZC1wcmVjczE5LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFkYXZpZC1wcmVjczE5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABRAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpEOmRhdmlkLXByZWNzMTkucGRmAAAOACQAEQBkAGEAdgBpAGQALQBwAHIAZQBjAHMAMQA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9EL2RhdmlkLXByZWNzMTkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {P-RECS'19},
+ date-added = {2019-06-25 11:22:58 -0700},
+ date-modified = {2020-01-04 21:28:38 -0700},
+ keywords = {papers, reproducibility, networking, experience},
+ month = {June 24},
+ title = {Reproducible Computer Network Experiments: A Case Study Using Popper},
+ year = {2019}
+}
+
diff --git a/content/publication/david-precs-19/index.md b/content/publication/david-precs-19/index.md
new file mode 100644
index 00000000000..6b479a3b6ac
--- /dev/null
+++ b/content/publication/david-precs-19/index.md
@@ -0,0 +1,14 @@
+---
+title: "Reproducible Computer Network Experiments: A Case Study Using Popper"
+date: 2019-06-01
+publishDate: 2020-01-05T06:43:50.418372Z
+authors: ["Andrea David", "Mariette Souppe", "Ivo Jimenez", "Katia Obraczka", "Sam Mansfield", "Kerry Veenstra", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Computer network research experiments can be broadly grouped in three categories: simulated, controlled, and real-world experiments. Simulation frameworks, experiment testbeds and measurement tools, respectively, are commonly used as the platforms for carrying out network experiments. In many cases, given the nature of computer network experiments, properly configuring these platforms is a complex and time-consuming task, which makes replicating and validating research results quite challenging. This complexity can be reduced by leveraging tools that enable experiment reproducibility. In this paper, we show how a recently proposed reproducibility tool called Popper facilitates the reproduction of networking exper- iments. In particular, we detail the steps taken to reproduce results in two published articles that rely on simulations. The outcome of this exercise is a generic workflow for carrying out network simulation experiments. In addition, we briefly present two additional Popper workflows for running experiments on controlled testbeds, as well as studies that gather real-world metrics (all code is publicly available on Github). We close by providing a list of lessons we learned throughout this process."
+featured: false
+publication: "*P-RECS'19*"
+tags: ["papers", "reproducibility", "networking", "experience"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/ellis-hicss-97/cite.bib b/content/publication/ellis-hicss-97/cite.bib
new file mode 100644
index 00000000000..96acb7df396
--- /dev/null
+++ b/content/publication/ellis-hicss-97/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{ellis:hicss97,
+ abstract = {Chautauqua is an exploratory workflow management system designed and implemented within the Collaboration Technology Research group (CTRG) at the University of Colorado. This system represents a tightly knit merger of workflow technology and groupware technology. Chautauqua has been in test usage at the University of Colorado since 1995. This document discusses Chautauqua: its motivation, its design, and its implementation. Our emphasis here is on its novel features and the techniques for implementing these features.},
+ address = {Wailea, Maui, HI},
+ author = {Clarence E. Ellis and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVRS1GL2VsbGlzLWhpY3NzOTcucGRmTxEBagAAAAABagACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EWVsbGlzLWhpY3NzOTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANFLUYAAAIAOy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6RS1GOmVsbGlzLWhpY3NzOTcucGRmAAAOACQAEQBlAGwAbABpAHMALQBoAGkAYwBzAHMAOQA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAmL015IERyaXZlL1BhcGVycy9FLUYvZWxsaXMtaGljc3M5Ny5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGq},
+ booktitle = {30th Hawaii International Conference on System Sciences, Information System Track},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:26:44 -0700},
+ keywords = {papers, workflow, cscw},
+ month = {January},
+ title = {The Chautauqua Workflow System},
+ year = {1997}
+}
+
diff --git a/content/publication/ellis-hicss-97/index.md b/content/publication/ellis-hicss-97/index.md
new file mode 100644
index 00000000000..b2ee8ffaa8d
--- /dev/null
+++ b/content/publication/ellis-hicss-97/index.md
@@ -0,0 +1,12 @@
+---
+title: "The Chautauqua Workflow System"
+date: 1997-01-01
+publishDate: 2020-01-05T13:33:06.021579Z
+authors: ["Clarence E. Ellis", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Chautauqua is an exploratory workflow management system designed and implemented within the Collaboration Technology Research group (CTRG) at the University of Colorado. This system represents a tightly knit merger of workflow technology and groupware technology. Chautauqua has been in test usage at the University of Colorado since 1995. This document discusses Chautauqua - its motivation, its design, and its implementation. Our emphasis here is on its novel features, and the techniques for implementing these features."
+featured: false
+publication: "*30th Hawaii International Conference on System Sciences, Information System Track*"
+tags: ["papers", "workflow", "cscw"]
+---
+
diff --git a/content/publication/ellis-jbcs-94/cite.bib b/content/publication/ellis-jbcs-94/cite.bib
new file mode 100644
index 00000000000..54e80398aeb
--- /dev/null
+++ b/content/publication/ellis-jbcs-94/cite.bib
@@ -0,0 +1,14 @@
+@article{ellis:jbcs94,
+ author = {Clarence E. Ellis and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAURS1GL2VsbGlzLWpiY3M5NC5wZGZPEQFmAAAAAAFmAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8QZWxsaXMtamJjczk0LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA6LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZWxsaXMtamJjczk0LnBkZgAOACIAEABlAGwAbABpAHMALQBqAGIAYwBzADkANAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAJS9NeSBEcml2ZS9QYXBlcnMvRS1GL2VsbGlzLWpiY3M5NC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADsAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABpQ==},
+ date-added = {2019-12-26 18:50:02 -0800},
+ date-modified = {2019-12-26 18:51:29 -0800},
+ journal = {Journal of the Brazilian Computer Society, Special Edition on CSCW},
+ keywords = {papers, cscw},
+ number = {1},
+ pages = {15--23},
+ title = {Collaboration with Spreadsheets},
+ volume = {1},
+ year = {1994}
+}
+
diff --git a/content/publication/ellis-jbcs-94/index.md b/content/publication/ellis-jbcs-94/index.md
new file mode 100644
index 00000000000..acd89cece64
--- /dev/null
+++ b/content/publication/ellis-jbcs-94/index.md
@@ -0,0 +1,12 @@
+---
+title: "Collaboration with Spreadsheets"
+date: 1994-01-01
+publishDate: 2020-01-05T06:43:50.392939Z
+authors: ["Clarence E. Ellis", "Carlos Maltzahn"]
+publication_types: ["2"]
+abstract: ""
+featured: false
+publication: "*Journal of the Brazilian Computer Society, Special Edition on CSCW*"
+tags: ["papers", "cscw"]
+---
+
diff --git a/content/publication/estolano-fast-08-wip/cite.bib b/content/publication/estolano-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..2fa5fe808dd
--- /dev/null
+++ b/content/publication/estolano-fast-08-wip/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{estolano:fast08wip,
+ address = {San Jose, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Sage Weil and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAaRS1GL2VzdG9sYW5vLWZhc3QwOHdpcC5wZGZPEQF+AAAAAAF+AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8WZXN0b2xhbm8tZmFzdDA4d2lwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgBALzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tZmFzdDA4d2lwLnBkZgAOAC4AFgBlAHMAdABvAGwAYQBuAG8ALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKy9NeSBEcml2ZS9QYXBlcnMvRS1GL2VzdG9sYW5vLWZhc3QwOHdpcC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEEAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABww==},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technologies (FAST 2008)},
+ date-added = {2019-12-29 16:38:04 -0800},
+ date-modified = {2019-12-29 16:39:22 -0800},
+ keywords = {shortpapers, loadbalancing, objectstorage, distributed, storage},
+ month = {February 26-29},
+ title = {Dynamic Load Balancing in Ceph},
+ year = {2008}
+}
+
diff --git a/content/publication/estolano-fast-08-wip/index.md b/content/publication/estolano-fast-08-wip/index.md
new file mode 100644
index 00000000000..42e93d8ac6d
--- /dev/null
+++ b/content/publication/estolano-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "Dynamic Load Balancing in Ceph"
+date: 2008-02-01
+publishDate: 2020-01-05T06:43:50.374116Z
+authors: ["Esteban Molina-Estolano", "Carlos Maltzahn", "Sage Weil", "Scott Brandt"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)*"
+tags: ["shortpapers", "loadbalancing", "objectstorage", "distributed", "storage"]
+---
+
diff --git a/content/publication/estolano-fast-09/cite.bib b/content/publication/estolano-fast-09/cite.bib
new file mode 100644
index 00000000000..859895ebecc
--- /dev/null
+++ b/content/publication/estolano-fast-09/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{estolano:fast09,
+ address = {San Francisco, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Scott A. Brandt and John Bent},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXRS1GL2VzdG9sYW5vLWZhc3QwOS5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TZXN0b2xhbm8tZmFzdDA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tZmFzdDA5LnBkZgAADgAoABMAZQBzAHQAbwBsAGEAbgBvAC0AZgBhAHMAdAAwADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1mYXN0MDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==},
+ booktitle = {WiP at FAST '09},
+ date-added = {2010-01-13 22:52:32 -0800},
+ date-modified = {2010-01-13 23:06:07 -0800},
+ month = {February 24-27},
+ title = {Comparing the Performance of Different Parallel File System Placement Strategies},
+ year = {2009}
+}
+
diff --git a/content/publication/estolano-fast-09/index.md b/content/publication/estolano-fast-09/index.md
new file mode 100644
index 00000000000..978c60bf515
--- /dev/null
+++ b/content/publication/estolano-fast-09/index.md
@@ -0,0 +1,13 @@
+---
+title: "Comparing the Performance of Different Parallel File system Placement Strategies"
+date: 2009-02-01
+publishDate: 2020-01-05T06:43:50.609621Z
+authors: ["Esteban Molina-Estolano", "Carlos Maltzahn", "Scott A. Brandt", "John Bent"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*WiP at FAST '09*"
+projects:
+- storage-simulation
+---
+
diff --git a/content/publication/estolano-jpcs-09/cite.bib b/content/publication/estolano-jpcs-09/cite.bib
new file mode 100644
index 00000000000..c6144976fa4
--- /dev/null
+++ b/content/publication/estolano-jpcs-09/cite.bib
@@ -0,0 +1,14 @@
+@article{estolano:jpcs09,
+ abstract = {Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges, scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and with reasonable fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and John Bent and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXRS1GL2VzdG9sYW5vLWpwY3MwOS5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TZXN0b2xhbm8tanBjczA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tanBjczA5LnBkZgAADgAoABMAZQBzAHQAbwBsAGEAbgBvAC0AagBwAGMAcwAwADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1qcGNzMDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:02:20 -0700},
+ journal = {J. Phys.: Conf. Ser.},
+ keywords = {papers, performance, simulation, filesystems},
+ number = {012050},
+ title = {Building a Parallel File System Simulator},
+ volume = {126},
+ year = {2009}
+}
+
diff --git a/content/publication/estolano-jpcs-09/index.md b/content/publication/estolano-jpcs-09/index.md
new file mode 100644
index 00000000000..61d99e915ae
--- /dev/null
+++ b/content/publication/estolano-jpcs-09/index.md
@@ -0,0 +1,14 @@
+---
+title: "Building a Parallel File System Simulator"
+date: 2009-01-01
+publishDate: 2020-01-05T13:33:05.997766Z
+authors: ["Esteban Molina-Estolano", "Carlos Maltzahn", "John Bent", "Scott A. Brandt"]
+publication_types: ["2"]
+abstract: "Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte- and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the class room. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability."
+featured: false
+publication: "*J. Phys.: Conf. Ser.*"
+tags: ["papers", "performance", "simulation", "filesystems"]
+projects:
+- storage-simulation
+---
+
diff --git a/content/publication/estolano-nsdi-10/cite.bib b/content/publication/estolano-nsdi-10/cite.bib
new file mode 100644
index 00000000000..c0e87665b62
--- /dev/null
+++ b/content/publication/estolano-nsdi-10/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{estolano:nsdi10,
+ address = {San Jose, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Ben Reed and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAgRS1GL2Vlc3RvbGFuLW5zZGkxMC1hYnN0cmFjdC5wZGZPEQGWAAAAAAGWAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8cZWVzdG9sYW4tbnNkaTEwLWFic3RyYWN0LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgBGLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZWVzdG9sYW4tbnNkaTEwLWFic3RyYWN0LnBkZgAOADoAHABlAGUAcwB0AG8AbABhAG4ALQBuAHMAZABpADEAMAAtAGEAYgBzAHQAcgBhAGMAdAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAMS9NeSBEcml2ZS9QYXBlcnMvRS1GL2Vlc3RvbGFuLW5zZGkxMC1hYnN0cmFjdC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB4Q==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAeRS1GL2Vlc3RvbGFuLW5zZGkxMC1wb3N0ZXIucGRmTxEBjgAAAAABjgACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GmVlc3RvbGFuLW5zZGkxMC1wb3N0ZXIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANFLUYAAAIARC86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6RS1GOmVlc3RvbGFuLW5zZGkxMC1wb3N0ZXIucGRmAA4ANgAaAGUAZQBzAHQAbwBsAGEAbgAtAG4AcwBkAGkAMQAwAC0AcABvAHMAdABlAHIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASAC8vTXkgRHJpdmUvUGFwZXJzL0UtRi9lZXN0b2xhbi1uc2RpMTAtcG9zdGVyLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQARQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHX},
+ booktitle = {Poster Session at NSDI 2010},
+ date-added = {2011-05-26 23:31:27 -0700},
+ date-modified = {2020-01-05 05:36:43 -0700},
+ keywords = {shortpapers, metadata, mapreduce, ceph},
+ month = {April 28-30},
+ title = {Haceph: Scalable Metadata Management for Hadoop using Ceph},
+ year = {2010}
+}
+
diff --git a/content/publication/estolano-nsdi-10/index.md b/content/publication/estolano-nsdi-10/index.md
new file mode 100644
index 00000000000..269eaa06ed0
--- /dev/null
+++ b/content/publication/estolano-nsdi-10/index.md
@@ -0,0 +1,12 @@
+---
+title: "Haceph: Scalable Metadata Management for Hadoop using Ceph"
+date: 2010-04-01
+publishDate: 2020-01-05T12:39:43.060103Z
+authors: ["Esteban Molina-Estolano", "Carlos Maltzahn", "Ben Reed", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at NSDI 2010*"
+tags: ["shortpapers", "metadata", "mapreduce", "ceph"]
+---
+
diff --git a/content/publication/estolano-pdsw-09/cite.bib b/content/publication/estolano-pdsw-09/cite.bib
new file mode 100644
index 00000000000..dae667abd69
--- /dev/null
+++ b/content/publication/estolano-pdsw-09/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{estolano:pdsw09,
+ abstract = {MapReduce-tailored distributed filesystems---such as HDFS for Hadoop MapReduce---and parallel high-performance computing filesystems are tailored for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently.
+We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each file system when it concurrently runs its ``native'' workload as well as the non-native workload.},
+ address = {Portland, OR},
+ author = {Esteban Molina-Estolano and Maya Gokhale and Carlos Maltzahn and John May and John Bent and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXRS1GL2VzdG9sYW5vLXBkc3cwOS5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TZXN0b2xhbm8tcGRzdzA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tcGRzdzA5LnBkZgAADgAoABMAZQBzAHQAbwBsAGEAbgBvAC0AcABkAHMAdwAwADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1wZHN3MDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==},
+ booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ date-added = {2010-01-03 23:04:09 -0800},
+ date-modified = {2020-01-05 05:51:32 -0700},
+ keywords = {papers, performance, hpc, mapreduce, filesystems},
+ month = {November 15},
+ title = {Mixing Hadoop and HPC Workloads on Parallel Filesystems},
+ year = {2009}
+}
+
diff --git a/content/publication/estolano-pdsw-09/index.md b/content/publication/estolano-pdsw-09/index.md
new file mode 100644
index 00000000000..6bfe556f93e
--- /dev/null
+++ b/content/publication/estolano-pdsw-09/index.md
@@ -0,0 +1,12 @@
+---
+title: "Mixing Hadoop and HPC Workloads on Parallel Filesystems"
+date: 2009-11-01
+publishDate: 2020-01-05T13:33:05.991308Z
+authors: ["Esteban Molina-Estolano", "Maya Gokhale", "Carlos Maltzahn", "John May", "John Bent", "Scott Brandt"]
+publication_types: ["1"]
+abstract: "MapReduce-tailored distributed filesystems---such as HDFS for Hadoop MapReduce---and parallel high-performance computing filesystems are tailored for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently. We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each file system when it concurrently runs its ``native'' workload as well as the non-native workload."
+featured: false
+publication: "*Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)*"
+tags: ["papers", "performance", "hpc", "mapreduce", "filesystems"]
+---
+
diff --git a/content/publication/hacker-ams-16/cite.bib b/content/publication/hacker-ams-16/cite.bib
new file mode 100644
index 00000000000..ad55c46163c
--- /dev/null
+++ b/content/publication/hacker-ams-16/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{hacker:ams16,
+ author = {Josh Hacker and John Exby and Nick Chartier and David Gill and Ivo Jimenez and Carlos Maltzahn and Gretchen Mullendore},
+ bdsk-url-1 = {https://drive.google.com/file/d/0B5rZ7hI6vXv3NXNRSWFnR2QwX2s/view},
+ booktitle = {American Meteorological Society 32nd Conference on Environmental Information Processing Technologies},
+ date-added = {2016-10-19 08:14:20 +0000},
+ date-modified = {2019-12-26 16:07:15 -0800},
+ keywords = {papers, reproducibility, containers},
+ month = {January},
+ title = {Collaborative Research and Education with Numerical Weather Prediction Enabled by Software Containers},
+ year = {2016}
+}
+
diff --git a/content/publication/hacker-ams-16/index.md b/content/publication/hacker-ams-16/index.md
new file mode 100644
index 00000000000..34fe0aa82a9
--- /dev/null
+++ b/content/publication/hacker-ams-16/index.md
@@ -0,0 +1,14 @@
+---
+title: "Collaborative Research and Education with Numerical Weather Prediction Enabled by Software Containers"
+date: 2016-01-01
+publishDate: 2020-01-05T06:43:50.456752Z
+authors: ["Josh Hacker", "John Exby", "Nick Chartier", "David Gill", "Ivo Jimenez", "Carlos Maltzahn", "Gretchen Mullendore"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*American Meteorological Society 32nd Conference on Environmental Processing Technologies*"
+tags: ["papers", "reproducibility", "containers"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/hacker-bams-17/cite.bib b/content/publication/hacker-bams-17/cite.bib
new file mode 100644
index 00000000000..690d114460e
--- /dev/null
+++ b/content/publication/hacker-bams-17/cite.bib
@@ -0,0 +1,14 @@
+@article{hacker:bams17,
+ abstract = {Software containers can revolutionize research and education with numerical weather prediction models by easing use and guaranteeing reproducibility.},
+ author = {Joshua P. Hacker and John Exby and David Gill and Ivo Jimenez and Carlos Maltzahn and Timothy See and Gretchen Mullendore and Kathryn Fossell},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATSC9oYWNrZXItYmFtczE3LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFoYWNrZXItYmFtczE3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpIOmhhY2tlci1iYW1zMTcucGRmAAAOACQAEQBoAGEAYwBrAGUAcgAtAGIAYQBtAHMAMQA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9IL2hhY2tlci1iYW1zMTcucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ date-added = {2017-08-29 05:50:47 +0000},
+ date-modified = {2020-01-04 21:40:58 -0700},
+ journal = {Bull. Amer. Meteor. Soc.},
+ keywords = {papers, containers, nwp, learning},
+ pages = {1129--1138},
+ title = {A Containerized Mesoscale Model and Analysis Toolkit to Accelerate Classroom Learning, Collaborative Research, and Uncertainty Quantification},
+ volume = {98},
+ year = {2017}
+}
+
diff --git a/content/publication/hacker-bams-17/index.md b/content/publication/hacker-bams-17/index.md
new file mode 100644
index 00000000000..0be68df133f
--- /dev/null
+++ b/content/publication/hacker-bams-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "A Containerized Mesoscale Model and Analysis Toolkit to Accelerate Classroom Learning, Collaborative Research, and Uncertainty Quantification"
+date: 2017-01-01
+publishDate: 2020-01-05T06:43:50.441905Z
+authors: ["Joshua P. Hacker", "John Exby", "David Gill", "Ivo Jimenez", "Carlos Maltzahn", "Timothy See", "Gretchen Mullendore", "Kathryn Fossell"]
+publication_types: ["2"]
+abstract: "Software containers can revolutionize research and education with numerical weather prediction models by easing use and guaranteeing reproducibility."
+featured: false
+publication: "*Bull. Amer. Meteor. Soc.*"
+tags: ["papers", "containers", "nwp", "learning"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/hacker-ncar-14/cite.bib b/content/publication/hacker-ncar-14/cite.bib
new file mode 100644
index 00000000000..4630f0a6dc9
--- /dev/null
+++ b/content/publication/hacker-ncar-14/cite.bib
@@ -0,0 +1,12 @@
+@misc{hacker:ncar14,
+ author = {Joshua Hacker and Carlos Maltzahn and Gretchen Mullendore and Russ Schumacher},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATSC9oYWNrZXItbmNhcjE0LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFoYWNrZXItbmNhcjE0LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpIOmhhY2tlci1uY2FyMTQucGRmAAAOACQAEQBoAGEAYwBrAGUAcgAtAG4AYwBhAHIAMQA0AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9IL2hhY2tlci1uY2FyMTQucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ date-added = {2014-06-21 21:53:41 +0000},
+ date-modified = {2014-06-24 17:21:49 +0000},
+ howpublished = {Web page. www.rap.ucar.edu/staff/hacker/BigWeather.pdf},
+ keywords = {papers, nwp, geoscience, simulation, infrastructure},
+ month = {January},
+ title = {Big Weather - A workshop on overcoming barriers to distributed production, storage, and analysis of multi-model ensemble forecasts in support of weather prediction research and education in universities},
+ year = {2014}
+}
+
diff --git a/content/publication/hacker-ncar-14/index.md b/content/publication/hacker-ncar-14/index.md
new file mode 100644
index 00000000000..827401e74e2
--- /dev/null
+++ b/content/publication/hacker-ncar-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Big Weather - A workshop on overcoming barriers to distributed production, storage, and analysis of multi-model ensemble forecasts in support of weather prediction research and education in universities"
+date: 2014-01-01
+publishDate: 2020-01-05T06:43:50.498502Z
+authors: ["Joshua Hacker", "Carlos Maltzahn", "Gretchen Mullendore", "Russ Schumacher"]
+publication_types: ["0"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "nwp", "geoscience", "simulation", "infrastructure"]
+---
+
diff --git a/content/publication/hacker-wrfws-16/cite.bib b/content/publication/hacker-wrfws-16/cite.bib
new file mode 100644
index 00000000000..8b77e1454f0
--- /dev/null
+++ b/content/publication/hacker-wrfws-16/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{hacker:wrfws16,
+ address = {Boulder, CO},
+ author = {Josh Hacker and John Exby and David Gill and Ivo Jimenez and Carlos Maltzahn and Tim See and Gretchen Mullendore},
+ bdsk-url-1 = {http://www2.mmm.ucar.edu/wrf/users/workshops/WS2016/oral_presentations/4.3.pdf},
+ booktitle = {17th annual WRF Users Workshop},
+ date-added = {2016-10-19 08:18:01 +0000},
+ date-modified = {2016-10-19 08:22:45 +0000},
+ month = {June 27 - July 2},
+ title = {Collaborative WRF-based research and education with reproducible numerical weather prediction enabled by software containers},
+ year = {2016}
+}
+
diff --git a/content/publication/hacker-wrfws-16/index.md b/content/publication/hacker-wrfws-16/index.md
new file mode 100644
index 00000000000..0b5c9715915
--- /dev/null
+++ b/content/publication/hacker-wrfws-16/index.md
@@ -0,0 +1,11 @@
+---
+title: "Collaborative WRF-based research and education with reproducible numerical weather prediction enabled by software containers"
+date: 2016-06-01
+publishDate: 2020-01-05T06:43:50.455361Z
+authors: ["Josh Hacker", "John Exby", "David Gill", "Ivo Jimenez", "Carlos Maltzahn", "Tim See", "Gretchen Mullendore"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*17th annual WRF Users Workshop*"
+---
+
diff --git a/content/publication/harrell-tpds-22/cite.bib b/content/publication/harrell-tpds-22/cite.bib
new file mode 100644
index 00000000000..7682086e39c
--- /dev/null
+++ b/content/publication/harrell-tpds-22/cite.bib
@@ -0,0 +1,16 @@
+@article{harrell:tpds22,
+ abstract = {In this special section we bring you a practice and experience effort in reproducibility for large-scale computational science at SC20. This section includes nine critiques, each by a student team that reproduced results from a paper published at SC19, during the following year's Student Cluster Competition. The paper is also included in this section and has been expanded upon, now including an analysis of the outcomes of the students' reproducibility experiments. Lastly, this special section encapsulates a variety of advances in reproducibility in the SC conference series technical program.},
+ author = {Stephen Lien Harrell and Scott Michael and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0gvaGFycmVsbC10cGRzMjIucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EmhhcnJlbGwtdHBkczIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFIAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkg6aGFycmVsbC10cGRzMjIucGRmAAAOACYAEgBoAGEAcgByAGUAbABsAC0AdABwAGQAcwAyADIALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9IL2hhcnJlbGwtdHBkczIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC},
+ date-added = {2022-04-11 19:38:53 -0700},
+ date-modified = {2022-04-11 19:42:38 -0700},
+ journal = {IEEE Transactions on Parallel and Distributed Systems},
+ keywords = {reproducibility, conference, hpc},
+ month = {September},
+ number = {9},
+ pages = {2011--2013},
+ title = {Advancing Adoption of Reproducibility in HPC: A Preface to the Special Section},
+ volume = {33},
+ year = {2022}
+}
+
diff --git a/content/publication/harrell-tpds-22/index.md b/content/publication/harrell-tpds-22/index.md
new file mode 100644
index 00000000000..efd0b4c32aa
--- /dev/null
+++ b/content/publication/harrell-tpds-22/index.md
@@ -0,0 +1,48 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: 'Advancing Adoption of Reproducibility in HPC: A Preface to the Special Section'
+subtitle: ''
+summary: ''
+authors:
+- Stephen Lien Harrell
+- Scott Michael
+- Carlos Maltzahn
+tags:
+- reproducibility
+- conference
+- hpc
+categories: []
+date: '2022-09-01'
+lastmod: 2022-04-25T15:07:59-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+publishDate: '2022-04-25T22:07:59.185121Z'
+publication_types:
+- '2'
+abstract: In this special section we bring you a practice and experience effort in
+ reproducibility for large-scale computational science at SC20. This section includes
+ nine critiques, each by a student team that reproduced results from a paper published
+ at SC19, during the following year's Student Cluster Competition. The paper is also
+ included in this section and has been expanded upon, now including an analysis of
+ the outcomes of the students' reproducibility experiments. Lastly, this special
+ section encapsulates a variety of advances in reproducibility in the SC conference
+ series technical program.
+publication: '*IEEE Transactions on Parallel and Distributed Systems*'
+---
diff --git a/content/publication/he-hpdc-13/cite.bib b/content/publication/he-hpdc-13/cite.bib
new file mode 100644
index 00000000000..cdbde728eb4
--- /dev/null
+++ b/content/publication/he-hpdc-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{he:hpdc13,
+ abstract = {The I/O bottleneck in high-performance computing is becoming worse as application data continues to grow. In this work, we explore how patterns of I/O within these applications can significantly affect the effectiveness of the underlying storage systems and how these same patterns can be utilized to improve many aspects of the I/O stack and mitigate the I/O bottleneck. We offer three main contributions in this paper. First, we develop and evaluate algorithms by which I/O patterns can be efficiently discovered and described. Second, we implement one such algorithm to reduce the metadata quantity in a virtual parallel file system by up to several orders of magnitude, thereby increasing the performance of writes and reads by up to 40 and 480 percent respectively. Third, we build a prototype file system with pattern-aware prefetching and evaluate it to show a 46 percent reduction in I/O latency. Finally, we believe that efficient pattern discovery and description, coupled with the observed predictability of complex patterns within many high-performance applications, offers significant potential to enable many additional I/O optimizations.},
+ address = {New York City, NY},
+ author = {Jun He and John Bent and Aaron Torres and Gary Grider and Garth Gibson and Carlos Maltzahn and Xian-He Sun},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPSC9oZS1ocGRjMTMucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWhlLWhwZGMxMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFIAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkg6aGUtaHBkYzEzLnBkZgAADgAcAA0AaABlAC0AaABwAGQAYwAxADMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0gvaGUtaHBkYzEzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ booktitle = {HPDC '13},
+ date-added = {2013-03-26 23:25:38 +0000},
+ date-modified = {2020-01-05 05:25:00 -0700},
+ keywords = {papers, compression, plfs, indexing, checkpointing, patterndetection},
+ month = {June 17-22},
+ title = {I/O Acceleration with Pattern Detection},
+ year = {2013}
+}
+
diff --git a/content/publication/he-hpdc-13/index.md b/content/publication/he-hpdc-13/index.md
new file mode 100644
index 00000000000..74d55d26c5e
--- /dev/null
+++ b/content/publication/he-hpdc-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "I/O Acceleration with Pattern Detection"
+date: 2013-06-01
+publishDate: 2020-01-05T12:39:43.027980Z
+authors: ["Jun He", "John Bent", "Aaron Torres", "Gary Grider", "Garth Gibson", "Carlos Maltzahn", "Xian-He Sun"]
+publication_types: ["1"]
+abstract: "The I/O bottleneck in high-performance computing is becoming worse as application data continues to grow. In this work, we explore how patterns of I/O within these applications can significantly affect the effectiveness of the underlying storage systems and how these same patterns can be utilized to improve many aspects of the I/O stack and mitigate the I/O bottleneck. We offer three main contributions in this paper. First, we develop and evaluate algorithms by which I/O patterns can be efficiently discovered and described. Second, we implement one such algorithm to reduce the metadata quantity in a virtual parallel file system by up to several orders of magnitude, thereby increasing the performance of writes and reads by up to 40 and 480 percent respectively. Third, we build a prototype file system with pattern-aware prefetching and evaluate it to show a 46 percent reduction in I/O latency. Finally, we believe that efficient pattern discovery and description, coupled with the observed predictability of complex patterns within many high-performance applications, offers significant potential to enable many additional I/O optimizations."
+featured: false
+publication: "*HPDC '13*"
+tags: ["papers", "compression", "plfs", "indexing", "checkpointing", "patterndetection"]
+---
+
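The two pattern-detection entries here (he-hpdc-13 above and he-pdsw-12 below) hinge on one idea: a PLFS-style index that records every write individually can be collapsed into a handful of pattern records when the writes are regular, which is where the orders-of-magnitude metadata reduction comes from. A minimal sketch of that idea, assuming an index of (logical_offset, length) records and a (start, stride, length, count) pattern record; the names are illustrative and not taken from the papers' code:

```python
# Illustrative sketch: collapse a PLFS-like write index into stride patterns.
# A record is (logical_offset, length); a pattern is (start, stride, length,
# count). The record layout is an assumption for illustration only.

def detect_patterns(records):
    """Greedily merge fixed-stride, fixed-length runs of index records."""
    patterns = []
    i = 0
    while i < len(records):
        start, length = records[i]
        count = 1
        stride = 0
        if i + 1 < len(records) and records[i + 1][1] == length:
            stride = records[i + 1][0] - start
            j = i + 1
            # Extend the run while offset and length stay regular.
            while (j < len(records)
                   and records[j][1] == length
                   and records[j][0] == start + stride * (j - i)):
                count += 1
                j += 1
        patterns.append((start, stride, length, count))
        i += count
    return patterns

# A strided checkpoint: 1000 writes of 4 KiB every 1 MiB collapse to one record.
index = [(k * 2**20, 4096) for k in range(1000)]
assert detect_patterns(index) == [(0, 2**20, 4096, 1000)]
```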
diff --git a/content/publication/he-pdsw-12/cite.bib b/content/publication/he-pdsw-12/cite.bib
new file mode 100644
index 00000000000..66cb6ec2019
--- /dev/null
+++ b/content/publication/he-pdsw-12/cite.bib
@@ -0,0 +1,17 @@
+@inproceedings{he:pdsw12,
+ abstract = {Checkpointing is the predominant storage driver in today's petascale supercomputers and is expected to remain as such in tomorrow's exascale supercomputers. Users typically prefer to checkpoint into a shared file, yet parallel file systems often perform poorly for shared file writing. A powerful technique to address this problem is to transparently transform shared file writing into many exclusively written files, as is done in ADIOS and PLFS. Unfortunately, the metadata to reconstruct the fragments into the original file grows with the number of writers. As such, the current approach cannot scale to exaflop supercomputers due to the large overhead of creating and reassembling the metadata.
+In this paper, we develop and evaluate algorithms by which patterns in the PLFS metadata can be discovered and then used to replace the current metadata. Our evaluation shows that these patterns reduce the size of the metadata by several orders of magnitude, increase the performance of writes by up to 40 percent, and the performance of reads by up to 480 percent. This contribution therefore can allow current checkpointing models to survive the transition from peta- to exascale.},
+ address = {Salt Lake City, UT},
+ author = {Jun He and John Bent and Aaron Torres and Gary Grider and Garth Gibson and Carlos Maltzahn and Xian-He Sun},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPSC9oZS1wZHN3MTIucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWhlLXBkc3cxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFIAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkg6aGUtcGRzdzEyLnBkZgAADgAcAA0AaABlAC0AcABkAHMAdwAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0gvaGUtcGRzdzEyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWSC9oZS1wZHN3MTItc2xpZGVzLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRoZS1wZHN3MTItc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSAAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpIOmhlLXBkc3cxMi1zbGlkZXMucGRmAA4AKgAUAGgAZQAtAHAAZABzAHcAMQAyAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0gvaGUtcGRzdzEyLXNsaWRlcy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:00:38 +0000},
+ date-modified = {2020-01-05 05:28:43 -0700},
+ keywords = {papers, compression, indexing, plfs, patterndetection, checkpointing},
+ month = {November 12},
+ read = {1},
+ title = {Discovering Structure in Unstructured I/O},
+ year = {2012}
+}
+
diff --git a/content/publication/he-pdsw-12/index.md b/content/publication/he-pdsw-12/index.md
new file mode 100644
index 00000000000..5c43572843c
--- /dev/null
+++ b/content/publication/he-pdsw-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "Discovering Structure in Unstructured I/O"
+date: 2012-11-01
+publishDate: 2020-01-05T13:33:05.965659Z
+authors: ["Jun He", "John Bent", "Aaron Torres", "Gary Grider", "Garth Gibson", "Carlos Maltzahn", "Xian-He Sun"]
+publication_types: ["1"]
+abstract: "Checkpointing is the predominant storage driver in today's petascale supercomputers and is expected to remain as such in tomorrow's exascale supercomputers. Users typically prefer to checkpoint into a shared file yet parallel file systems often perform poorly for shared file writing. A powerful technique to address this problem is to transparently transform shared file writing into many exclusively written as is done in ADIOS and PLFS. Unfortunately, the metadata to reconstruct the fragments into the original file grows with the number of writers. As such, the current approach cannot scale to exaflop supercomputers due to the large overhead of creating and reassembling the metadata. In this paper, we develop and evaluate algorithms by which patterns in the PLFS metadata can be discovered and then used to replace the current metadata. Our evaluation shows that these patterns reduce the size of the metadata by several orders of magnitude, increase the performance of writes by up to 40 percent, and the performance of reads by up to 480 percent. This contribution therefore can allow current checkpointing models to survive the transition from peta- to exascale."
+featured: false
+publication: "*PDSW'12*"
+tags: ["papers", "compression", "indexing", "plfs", "patterndetection", "checkpointing"]
+---
+
diff --git a/content/publication/ionkov-lanltr-11/cite.bib b/content/publication/ionkov-lanltr-11/cite.bib
new file mode 100644
index 00000000000..8d200b4014d
--- /dev/null
+++ b/content/publication/ionkov-lanltr-11/cite.bib
@@ -0,0 +1,11 @@
+@techreport{ionkov:lanltr11,
+ author = {Ionkov, Latchesar and Lang, Michael and Maltzahn, Carlos},
+ date-added = {2012-01-24 15:38:48 +0000},
+ date-modified = {2012-01-24 15:38:48 +0000},
+ institution = {Los Alamos National Laboratory},
+ number = {LA-UR-11-11589},
+ title = {DRepl: Optimizing Access to Application Data for Analysis and Visualization},
+ type = {Technical Report},
+ year = {2011}
+}
+
diff --git a/content/publication/ionkov-lanltr-11/index.md b/content/publication/ionkov-lanltr-11/index.md
new file mode 100644
index 00000000000..d853e2fdf08
--- /dev/null
+++ b/content/publication/ionkov-lanltr-11/index.md
@@ -0,0 +1,11 @@
+---
+title: "DRepl: Optimizing Access to Application Data for Analysis and Visualization"
+date: 2011-01-01
+publishDate: 2020-01-05T12:39:43.049202Z
+authors: ["Latchesar Ionkov", "Michael Lang", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/ionkov-msst-13/cite.bib b/content/publication/ionkov-msst-13/cite.bib
new file mode 100644
index 00000000000..e45bb27e036
--- /dev/null
+++ b/content/publication/ionkov-msst-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{ionkov:msst13,
+ abstract = {Until recently, most scientific applications produced data that is saved, analyzed, and visualized at a later time. In recent years, with the large increase in the amount of data and computational power available, there is demand for applications to support data access in situ, or close to the simulation, to provide application steering, analytics, and visualization. Data access patterns required for these activities are usually different than the data layout produced by the application. In most of the large HPC clusters scientific data is stored in parallel file systems instead of locally on the cluster nodes. To increase reliability, the data is replicated, usually using some of the standard RAID schemes. Parallel file server nodes usually have more processing power than they need, so it is feasible to offload some of the data-intensive processing to them. The DRepl project replaces the standard methods of data replication with replicas having different layouts, optimized for the most commonly used access patterns. Replicas can be complete (i.e., any other replica can be reconstructed from it) or incomplete. DRepl consists of a language to describe the dataset and the necessary data layouts and tools to create a user-space file server that provides and keeps the data consistent and up to date in all optimized layouts.},
+ address = {Long Beach, CA},
+ author = {Latchesar Ionkov and Mike Lang and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSS1KL2lvbmtvdi1tc3N0MTMucGRmTxEBagAAAAABagACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EWlvbmtvdi1tc3N0MTMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAOy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmlvbmtvdi1tc3N0MTMucGRmAAAOACQAEQBpAG8AbgBrAG8AdgAtAG0AcwBzAHQAMQAzAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAmL015IERyaXZlL1BhcGVycy9JLUovaW9ua292LW1zc3QxMy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGq},
+ booktitle = {MSST '13},
+ date-added = {2013-03-26 23:29:57 +0000},
+ date-modified = {2020-01-05 05:23:58 -0700},
+ keywords = {papers, redundancy, layout, hpc, storage, storagemedium, languages},
+ month = {May 6-10},
+ title = {DRepl: Optimizing Access to Application Data for Analysis and Visualization},
+ year = {2013}
+}
+
diff --git a/content/publication/ionkov-msst-13/index.md b/content/publication/ionkov-msst-13/index.md
new file mode 100644
index 00000000000..f7e07413e21
--- /dev/null
+++ b/content/publication/ionkov-msst-13/index.md
@@ -0,0 +1,14 @@
+---
+title: "DRepl: Optimizing Access to Application Data for Analysis and Visualization"
+date: 2013-05-01
+publishDate: 2020-01-05T12:39:43.023100Z
+authors: ["Latchesar Ionkov", "Mike Lang", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Until recently most scientific applications produced data that is saved, analyzed and visualized at later time. In recent years, with the large increase in the amount of data and computational power available there is demand for applications to support data access in-situ, or close-to simulation to provide application steering, analytics and visualization. Data access patterns required for these activities are usually different than the data layout produced by the application. In most of the large HPC clusters scientific data is stored in parallel file systems instead of locally on the cluster nodes. To increase reliability, the data is replicated, usually using some of the standard RAID schemes. Parallel file server nodes usually have more processing power than they need, so it is feasible to offload some of the data intensive processing to them. DRepl project replaces the standard methods of data replication with replicas having different layouts, optimized for the most commonly used access patterns. Replicas can be complete (i.e. any other replica can be reconstructed from it), or incomplete. DRepl consists of a language to describe the dataset and the necessary data layouts and tools to create a user-space file server that provides and keeps the data consistent and up to date in all optimized layouts."
+featured: false
+publication: "*MSST '13*"
+tags: ["papers", "redundancy", "layout", "hpc", "storage", "storagemedium", "languages"]
+projects:
+- programmable-storage
+---
+
diff --git a/content/publication/ionkov-pdsw-17/cite.bib b/content/publication/ionkov-pdsw-17/cite.bib
new file mode 100644
index 00000000000..4321a07689a
--- /dev/null
+++ b/content/publication/ionkov-pdsw-17/cite.bib
@@ -0,0 +1,17 @@
+@inproceedings{ionkov-pdsw17,
+ abstract = {Scientific workflows contain an increasing number of interacting applications, often with big disparity between the formats of data being produced and consumed by different applications. This mismatch can result in performance degradation as data retrieval causes multiple read operations (often to a remote storage system) in order to convert the data. Although some parallel filesystems and middleware libraries attempt to identify access patterns and optimize data retrieval, they frequently fail if the patterns are complex.
+The goal of ASGARD is to replace I/O operations issued to a file by the processes with a single operation that passes enough semantic information to the storage system, so it can combine (and eventually optimize) the data movement. ASGARD allows application developers to define their application's abstract dataset as well as the subsets of the data (fragments) that are created and used by the HPC codes. It uses the semantic information to generate and execute transformation rules that convert the data between the memory layouts of the producer and consumer applications, as well as the layout on nonvolatile storage. The transformation engine implements functionality similar to the scatter/gather support available in some file systems. Since data subsets are defined during the initialization phase, i.e., well in advance of the time they are used to store and retrieve data, the storage system has multiple opportunities to optimize both the data layout and the transformation rules in order to increase the overall I/O performance.
+In order to evaluate ASGARD's performance, we added support for ASGARD's transformation rules to Ceph's object store RADOS. We created Ceph data objects that allow custom data striping based on ASGARD's fragment definitions. Our tests with the extended RADOS show up to 5 times performance improvements for writes and 10 times performance improvements for reads over collective MPI I/O.},
+ address = {Denver, CO},
+ author = {Latchesar Ionkov and Carlos Maltzahn and Michael Lang},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSS1KL2lvbmtvdi1wZHN3MTcucGRmTxEBagAAAAABagACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EWlvbmtvdi1wZHN3MTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAOy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmlvbmtvdi1wZHN3MTcucGRmAAAOACQAEQBpAG8AbgBrAG8AdgAtAHAAZABzAHcAMQA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAmL015IERyaXZlL1BhcGVycy9JLUovaW9ua292LXBkc3cxNy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGq},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAcSS1KL2lvbmtvdi1wZHN3MTctc2xpZGVzLnBkZk8RAYYAAAAAAYYAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xhpb25rb3YtcGRzdzE3LXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAADSS1KAAACAEIvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkktSjppb25rb3YtcGRzdzE3LXNsaWRlcy5wZGYADgAyABgAaQBvAG4AawBvAHYALQBwAGQAcwB3ADEANwAtAHMAbABpAGQAZQBzAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAtL015IERyaXZlL1BhcGVycy9JLUovaW9ua292LXBkc3cxNy1zbGlkZXMucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABDAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAc0=},
+ booktitle = {PDSW-DISCS 2017 at SC17},
+ date-added = {2017-11-07 16:45:07 +0000},
+ date-modified = {2020-01-04 21:39:53 -0700},
+ keywords = {papers, replication, layout, language},
+ month = {Nov 13},
+ title = {Optimized Scatter/Gather Data Operations for Parallel Storage},
+ year = {2017}
+}
+
diff --git a/content/publication/ionkov-pdsw-17/index.md b/content/publication/ionkov-pdsw-17/index.md
new file mode 100644
index 00000000000..1498c90363a
--- /dev/null
+++ b/content/publication/ionkov-pdsw-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "Optimized Scatter/Gather Data Operations for Parallel Storage"
+date: 2017-11-01
+publishDate: 2020-01-05T06:43:50.440696Z
+authors: ["Latchesar Ionkov", "Carlos Maltzahn", "Michael Lang"]
+publication_types: ["1"]
+abstract: "Scientific workflows contain an increasing number of interacting applications, often with big disparity between the formats of data being produced and consumed by different applications. This mismatch can result in performance degradation as data retrieval causes multiple read operations (often to a remote storage system) in order to convert the data. Although some parallel filesystems and middleware libraries attempt to identify access patterns and optimize data retrieval, they frequently fail if the patterns are complex. The goal of ASGARD is to replace I/O operations issued to a file by the processes with a single operation that passes enough semantic information to the storage system, so it can combine (and eventually optimize) the data movement. ASGARD allows application developers to define their application's abstract dataset as well as the subsets of the data (fragments) that are created and used by the HPC codes. It uses the semantic information to generate and execute transformation rules that convert the data between the the memory layouts of the producer and consumer applications, as well as the layout on nonvolatile storage. The transformation engine implements functionality similar to the scatter/gather support available in some file systems. Since data subsets are defined during the initialization phase, i.e., well in advance from the time they are used to store and retrieve data, the storage system has multiple opportunities to optimize both the data layout and the transformation rules in order to increase the overall I/O performance. In order to evaluate ASGARD's performance, we added support for ASGARD's transformation rules to Ceph's object store RADOS. We created Ceph data objects that allow custom data striping based on ASGARD's fragment definitions. Our tests with the extended RADOS show up to 5 times performance improvements for writes and 10 times performance improvements for reads over collective MPI I/O."
+featured: false
+publication: "*PDSW-DISCS 2017 at SC17*"
+tags: ["papers", "replication", "layout", "language"]
+projects:
+- programmable-storage
+---
+
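The ASGARD abstract above describes transformation rules that gather declared fragments out of a producer's layout and scatter them back into a consumer's. A toy sketch of that gather/scatter core, under the assumption that a fragment is just a list of (offset, size) extents over a flat buffer; nothing here is the ASGARD API:

```python
# Toy gather/scatter between layouts, in the spirit of the ASGARD abstract
# (a fragment is declared once, then applied many times). Names are
# illustrative assumptions, not the paper's implementation.

def gather(src: bytes, extents):
    """Collect (offset, size) extents of src into one contiguous buffer."""
    return b"".join(src[off:off + size] for off, size in extents)

def scatter(dst: bytearray, extents, data: bytes):
    """Inverse of gather: spread contiguous data back onto the extents."""
    pos = 0
    for off, size in extents:
        dst[off:off + size] = data[pos:pos + size]
        pos += size

# Producer writes an interleaved array-of-structs; consumer wants one field.
aos = bytes(range(24))                      # 6 structs of 4 bytes each
field0 = [(i * 4, 1) for i in range(6)]     # first byte of every struct
col = gather(aos, field0)                   # -> b'\x00\x04\x08\x0c\x10\x14'
buf = bytearray(24)
scatter(buf, field0, col)                   # round-trips the field
assert [buf[i * 4] for i in range(6)] == list(col)
```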
diff --git a/content/publication/jarke-ijicis-92/cite.bib b/content/publication/jarke-ijicis-92/cite.bib
new file mode 100644
index 00000000000..8aae0a3a512
--- /dev/null
+++ b/content/publication/jarke-ijicis-92/cite.bib
@@ -0,0 +1,15 @@
+@article{jarke:ijicis92,
+ abstract = {Information systems support for design environments emphasizes object management and tends to neglect the growing demand for team support. Process management is often tackled by rigid technological protocols which are likely to get in the way of group productivity and quality. Group tools must be introduced in an unobtrusive way which extends current practice yet provides structure and documentation of development experiences. The concept of sharing processes allows agents to coordinate the sharing of ideas, tasks, and results by interacting protocol automata which can be dynamically adapted to situational requirements. Inconsistency is managed with the same emphasis as consistency. The sharing process approach has been implemented in a system called ConceptTalk which has been experimentally integrated with design environments for information and hypertext systems.},
+ author = {Matthias Jarke and Carlos Maltzahn and Thomas Rose},
+ bdsk-url-1 = {https://www.worldscientific.com/doi/abs/10.1142/S0218215792000076},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:25:27 -0700},
+ journal = {International Journal of Intelligent and Cooperative Information Systems},
+ keywords = {papers, sharing, cscw, datamanagement},
+ number = {1},
+ pages = {145--167},
+ title = {Sharing Processes: Team Coordination in Design Repositories},
+ volume = {1},
+ year = {1992}
+}
+
diff --git a/content/publication/jarke-ijicis-92/index.md b/content/publication/jarke-ijicis-92/index.md
new file mode 100644
index 00000000000..4fd2a4b07db
--- /dev/null
+++ b/content/publication/jarke-ijicis-92/index.md
@@ -0,0 +1,12 @@
+---
+title: "Sharing Processes: Team Coordination in Design Repositories"
+date: 1992-01-01
+publishDate: 2020-01-05T13:33:06.020535Z
+authors: ["Matthias Jarke", "Carlos Maltzahn", "Thomas Rose"]
+publication_types: ["2"]
+abstract: "Information systems support for design environments emphasizes object management and tends to neglect the growing demand for team support. Process management is often tackled by rigid technological protocols which are likely to get in the way of group productivity and quality. Group tools must be introduced in an unobtrusive way which extends current practice yet provides structure and documentation of development experiences. The concept of sharing processes allows agents to coordinate the sharing of ideas, tasks, and results by interacting protocol automata which can be dynamically adapted to situational requirements. Inconsistency is managed with equal emphasis as consistency. The sharing process approach has been implemented in a system called ConceptTalk which has been experimentally integrated with design environments for information and hypertext systems."
+featured: false
+publication: "*International Journal of Intelligent and Cooperative Information Systems*"
+tags: ["papers", "sharing", "cscw", "datamanagement"]
+---
+
diff --git a/content/publication/jia-hipc-17/cite.bib b/content/publication/jia-hipc-17/cite.bib
new file mode 100644
index 00000000000..c5a900b1bbc
--- /dev/null
+++ b/content/publication/jia-hipc-17/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{jia:hipc17,
+ abstract = {Accessing external resources (e.g., loading input data, checkpointing snapshots, and out-of-core processing) can have a significant impact on the performance of applications. However, no existing programming systems for high-performance computing directly manage and optimize external accesses. As a result, users must explicitly manage external accesses alongside their computation at the application level, which can result in both correctness and performance issues.
+We address this limitation by introducing Iris, a task-based programming model with semantics for external resources. Iris allows applications to describe their access requirements to external resources and the relationship of those accesses to the computation. Iris incorporates external I/O into a deferred execution model, reschedules external I/O to overlap I/O with computation, and reduces external I/O when possible. We evaluate Iris on three microbenchmarks representative of important workloads in HPC and a full combustion simulation, S3D. We demonstrate that the Iris implementation of S3D reduces the external I/O overhead by up to 20x, compared to the Legion and the Fortran implementations.},
+ address = {Jaipur, India},
+ author = {Zhihao Jia and Sean Treichler and Galen Shipman and Michael Bauer and Noah Watkins and Carlos Maltzahn and Pat McCormick and Alex Aiken},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASSS1KL2ppYS1oaXBjMTcucGRmTxEBXgAAAAABXgACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DmppYS1oaXBjMTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAUERGIENBUk8AAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAOC86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmppYS1oaXBjMTcucGRmAA4AHgAOAGoAaQBhAC0AaABpAHAAYwAxADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACMvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaWEtaGlwYzE3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGb},
+ booktitle = {HiPC 2017},
+ date-added = {2018-04-03 18:26:23 +0000},
+ date-modified = {2020-01-04 22:56:24 -0700},
+ keywords = {papers, runtime, distributed, programming, storage},
+ month = {December 18-21},
+ title = {Integrating External Resources with a Task-Based Programming Model},
+ year = {2017}
+}
+
diff --git a/content/publication/jia-hipc-17/index.md b/content/publication/jia-hipc-17/index.md
new file mode 100644
index 00000000000..62a76d04fc0
--- /dev/null
+++ b/content/publication/jia-hipc-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "Integrating External Resources with a Task-Based Programming Model"
+date: 2017-12-01
+publishDate: 2020-01-05T06:43:50.437903Z
+authors: ["Zhihao Jia", "Sean Treichler", "Galen Shipman", "Michael Bauer", "Noah Watkins", "Carlos Maltzahn", "Pat McCormick", "Alex Aiken"]
+publication_types: ["1"]
+abstract: "Accessing external resources (e.g., loading input data, checkpointing snapshots, and out-of-core processing) can have a significant impact on the performance of applications. However, no existing programming systems for high-performance computing directly manage and optimize external accesses. As a result, users must explicitly manage external accesses alongside their computation at the application level, which can result in both correctness and performance issues. We address this limitation by introducing Iris, a task-based programming model with semantics for external resources. Iris allows applications to describe their access requirements to external resources and the relationship of those accesses to the computation. Iris incorporates external I/O into a deferred execution model, reschedules external I/O to overlap I/O with computation, and reduces external I/O when possible. We evaluate Iris on three microbenchmarks representative of important workloads in HPC and a full combustion simulation, S3D. We demonstrate that the Iris implementation of S3D reduces the external I/O overhead by up to 20x, compared to the Legion and the Fortran implementations."
+featured: false
+publication: "*HiPC 2017*"
+tags: ["papers", "runtime", "distributed", "programming", "storage"]
+projects:
+- programmable-storage
+---
+
diff --git a/content/publication/jimenez-agu-18/cite.bib b/content/publication/jimenez-agu-18/cite.bib
new file mode 100644
index 00000000000..5d3e669c16d
--- /dev/null
+++ b/content/publication/jimenez-agu-18/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{jimenez:agu18,
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LWFndTE4LnBkZk8RAWoAAAAAAWoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAN+vI3hCRAAB/////xFqaW1lbmV6LWFndTE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TxDwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADSS1KAAACADgvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LWFndTE4LnBkZgAOACQAEQBqAGkAbQBlAG4AZQB6AC0AYQBnAHUAMQA4AC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA2VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvSS1KL2ppbWVuZXotYWd1MTgucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFUAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABww==},
+ booktitle = {AGU Fall Meeting},
+ date-added = {2023-01-11 22:59:55 -0800},
+ date-modified = {2023-01-11 23:06:28 -0800},
+ keywords = {reproducibility},
+ month = {December 12-14},
+ title = {Reproducible, Automated and Portable Computational and Data Science Experimentation Pipelines with Popper},
+ year = {2018}
+}
+
diff --git a/content/publication/jimenez-agu-18/index.md b/content/publication/jimenez-agu-18/index.md
new file mode 100644
index 00000000000..632dc39d5b6
--- /dev/null
+++ b/content/publication/jimenez-agu-18/index.md
@@ -0,0 +1,12 @@
+---
+title: "Reproducible, Automated and Portable Computational and Data Science Experimentation Pipelines with Popper"
+date: 2018-12-01
+publishDate: 2023-01-26T14:23:16.862082Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*AGU Fall Meeting*"
+tags: ["reproducibility"]
+---
+
diff --git a/content/publication/jimenez-cnert-17/cite.bib b/content/publication/jimenez-cnert-17/cite.bib
new file mode 100644
index 00000000000..179fb989524
--- /dev/null
+++ b/content/publication/jimenez-cnert-17/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:cnert17,
+ abstract = {This paper introduces PopperCI, a continuous integration (CI) service hosted at UC Santa Cruz that allows researchers to automate the end-to-end execution and validation of experiments. PopperCI assumes that experiments follow Popper, a convention for implementing experiments and writing articles following a DevOps approach that has been proposed recently. PopperCI runs experiments on public, private or government-funded cloud infrastructures in a fully automated way. We describe how PopperCI executes experiments and present a use case that illustrates the usefulness of the service.},
+ address = {Atlanta, GA},
+ author = {Ivo Jimenez and Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau and Jay Lofstead and Carlos Maltzahn and Kathryn Mohror and Robert Ricci},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXSS1KL2ppbWVuZXotY25lcnQxNy5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TamltZW5lei1jbmVydDE3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0ktSgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpJLUo6amltZW5lei1jbmVydDE3LnBkZgAADgAoABMAagBpAG0AZQBuAGUAegAtAGMAbgBlAHIAdAAxADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LWNuZXJ0MTcucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==},
+ booktitle = {Workshop on Computer and Networking Experimental Research Using Testbeds (CNERT'17) in conjunction with IEEE INFOCOM 2017},
+ date-added = {2017-07-31 03:37:33 +0000},
+ date-modified = {2020-01-04 21:41:20 -0700},
+ keywords = {papers, reproducibility, devops},
+ month = {May 1},
+ title = {PopperCI: Automated Reproducibility Validation},
+ year = {2017}
+}
+
diff --git a/content/publication/jimenez-cnert-17/index.md b/content/publication/jimenez-cnert-17/index.md
new file mode 100644
index 00000000000..c76653ada35
--- /dev/null
+++ b/content/publication/jimenez-cnert-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "PopperCI: Automated Reproducibility Validation"
+date: 2017-05-01
+publishDate: 2020-01-05T06:43:50.443602Z
+authors: ["Ivo Jimenez", "Andrea Arpaci-Dusseau", "Remzi Arpaci-Dusseau", "Jay Lofstead", "Carlos Maltzahn", "Kathryn Mohror", "Robert Ricci"]
+publication_types: ["1"]
+abstract: "This paper introduces PopperCI, a continous integration (CI) service hosted at UC Santa Cruz that allows researchers to automate the end-to-end execution and validation of experiments. PopperCI assumes that experiments follow Popper, a convention for implementing experiments and writing articles following a DevOps approach that has been proposed recently. PopperCI runs experiments on public, private or government-fundend cloud infrastructures in a fully automated way. We describe how PopperCI executes experiments and present a use case that illustrates the usefulness of the service."
+featured: false
+publication: "*Workshop on Computer and Networking Experimental Research Using Testbeds (CNERT'17) in conjunction with IEEE INFOCOM 2017*"
+tags: ["papers", "reproducibility", "devops"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/jimenez-icpe-18/cite.bib b/content/publication/jimenez-icpe-18/cite.bib
new file mode 100644
index 00000000000..27a3a9fd513
--- /dev/null
+++ b/content/publication/jimenez-icpe-18/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:icpe18,
+ abstract = {We introduce quiho, a framework for profiling application performance that can be used in automated performance regression tests. quiho profiles an application by applying sensitivity analysis, in particular statistical regression analysis (SRA), using application-independent performance feature vectors that characterize the performance of machines. The result of the SRA, feature importance specifically, is used as a proxy to identify hardware and low-level system software behavior. The relative importance of these features serve as a performance profile of an application (termed inferred resource utilization profile or IRUP), which is used to automatically validate performance behavior across multiple revisions of an application's code base without having to instrument code or obtain performance counters. We demonstrate that quiho can successfully discover performance regressions by showing its effectiveness in profiling application performance for synthetically introduced regressions as well as those found in real-world applications.},
+ address = {Berlin, Germany},
+ author = {Ivo Jimenez and Noah Watkins and Michael Sevilla and Jay Lofstead and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWSS1KL2ppbWVuZXotaWNwZTE4LnBkZk8RAW4AAAAAAW4AAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xJqaW1lbmV6LWljcGUxOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAADSS1KAAACADwvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LWljcGUxOC5wZGYADgAmABIAagBpAG0AZQBuAGUAegAtAGkAYwBwAGUAMQA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAnL015IERyaXZlL1BhcGVycy9JLUovamltZW5lei1pY3BlMTgucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA9AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAa8=},
+ booktitle = {9th ACM/SPEC International Conference on Performance Engineering (ICPE 2018)},
+ date-added = {2019-12-26 15:51:19 -0800},
+ date-modified = {2020-01-04 23:15:59 -0700},
+ keywords = {papers, reproducibility, performance, testing},
+ month = {April 9-13},
+ title = {quiho: Automated Performance Regression Testing Using Inferred Resource Utilization Profiles},
+ year = {2018}
+}
+
diff --git a/content/publication/jimenez-icpe-18/index.md b/content/publication/jimenez-icpe-18/index.md
new file mode 100644
index 00000000000..503601d1593
--- /dev/null
+++ b/content/publication/jimenez-icpe-18/index.md
@@ -0,0 +1,14 @@
+---
+title: "quiho: Automated Performance Regression Testing Using Inferred Resource Utilization Profiles"
+date: 2018-04-01
+publishDate: 2020-01-05T06:43:50.407189Z
+authors: ["Ivo Jimenez", "Noah Watkins", "Michael Sevilla", "Jay Lofstead", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "We introduce quiho, a framework for profiling application performance that can be used in automated performance regression tests. quiho profiles an application by applying sensitivity analysis, in particular statistical regression analysis (SRA), using application-independent performance feature vectors that characterize the performance of machines. The result of the SRA, feature importance specifically, is used as a proxy to identify hardware and low-level system software behavior. The relative importance of these features serve as a performance profile of an application (termed inferred resource utilization profile or IRUP), which is used to automatically validate performance behavior across multiple revisions of an application's code base without having to instrument code or obtain performance counters. We demonstrate that quiho can successfully discover performance regressions by showing its effectiveness in profiling application performance for synthetically introduced regressions as well as those found in real-world applications."
+featured: false
+publication: "*9th ACM/SPEC International Conference on Performance Engineering (ICPE 2018)*"
+tags: ["papers", "reproducibility", "performance", "testing"]
+projects:
+- practical-reproducibility
+---
+
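The quiho abstract above reduces to a standard regression step: fit runtimes against per-machine performance feature vectors and read the relative feature importances as the application's inferred resource utilization profile (IRUP). A minimal sketch of that step, assuming scikit-learn and purely synthetic data (quiho's real inputs are machine-wide microbenchmark vectors, not these made-up columns):

```python
# Sketch of the SRA step from the quiho abstract: feature importances of a
# regression from machine features to runtime serve as a performance profile.
# Feature names and data are synthetic stand-ins, not quiho's benchmarks.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
features = ["mem_bw", "disk_iops", "cpu_flops"]
X = rng.uniform(1.0, 10.0, size=(200, 3))            # one row per machine
runtime = 100.0 / X[:, 0] + rng.normal(0, 0.1, 200)  # a memory-bound app

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, runtime)
irup = dict(zip(features, model.feature_importances_))
print(irup)  # mem_bw dominates -> the app profiles as memory-bound
```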
diff --git a/content/publication/jimenez-login-16/cite.bib b/content/publication/jimenez-login-16/cite.bib
new file mode 100644
index 00000000000..809c3b32046
--- /dev/null
+++ b/content/publication/jimenez-login-16/cite.bib
@@ -0,0 +1,15 @@
+@article{jimenez:login16,
+ abstract = {Independently validating experimental results in the field of computer systems research is a challenging task. Recreating an environment that resembles the one where an experiment was originally executed is a time-consuming endeavor. In this article, we present Popper, a convention (or protocol) for conducting experiments following a DevOps approach that allows researchers to make all associated artifacts publicly available with the goal of maximizing automation in the re-execution of an experiment and validation of its results.},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXSS1KL2ppbWVuZXotbG9naW4xNi5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TamltZW5lei1sb2dpbjE2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0ktSgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpJLUo6amltZW5lei1sb2dpbjE2LnBkZgAADgAoABMAagBpAG0AZQBuAGUAegAtAGwAbwBnAGkAbgAxADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LWxvZ2luMTYucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==},
+ date-added = {2017-01-17 23:58:32 +0000},
+ date-modified = {2020-01-04 21:44:35 -0700},
+ journal = {USENIX ;login:},
+ keywords = {papers, reproducibility, devops, versioning},
+ number = {4},
+ pages = {20--26},
+ title = {Standing on the Shoulders of Giants by Managing Scientific Experiments Like Software},
+ volume = {41},
+ year = {2016}
+}
+
diff --git a/content/publication/jimenez-login-16/index.md b/content/publication/jimenez-login-16/index.md
new file mode 100644
index 00000000000..d085d6b0378
--- /dev/null
+++ b/content/publication/jimenez-login-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "Standing on the Shoulders of Giants by Managing Scientific Experiments Like Software"
+date: 2016-01-01
+publishDate: 2020-01-05T06:43:50.450758Z
+authors: ["Ivo Jimenez", "Michael Sevilla", "Noah Watkins", "Carlos Maltzahn", "Jay Lofstead", "Kathryn Mohror", "Remzi Arpaci-Dusseau", "Andrea Arpaci-Dusseau"]
+publication_types: ["2"]
+abstract: "Independently validating experimental results in the field of computer systems research is a challenging task. Recreating an environment that resembles the one where an experiment was originally executed is a time-consuming endeavor. In this article, we present Popper, a convention (or protocol) for conducting experiments following a DevOps approach that allows researchers to make all associated artifacts publicly available with the goal of maximizing automation in the re-execution of an experiment and validation of its results."
+featured: false
+publication: "*USENIX ;login:*"
+tags: ["papers", "reproducibility", "devops", "versioning"]
+---
+
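Popper, which anchors this entry and the nearby cnert17 and reppar17 entries, treats an experiment as a repository of versioned stage scripts that anyone can re-execute end to end. A toy runner conveying that shape, assuming the conventional setup/run/post-run/validate stage names; this is an illustration, not the popper CLI:

```python
# Toy Popper-style pipeline runner: stages are versioned scripts executed in
# order, so re-running an experiment is one command on a fresh clone.
# Stage names and layout are assumptions for illustration.
import subprocess
import sys
from pathlib import Path

STAGES = ["setup.sh", "run.sh", "post-run.sh", "validate.sh"]

def run_pipeline(experiment_dir: str) -> None:
    for stage in STAGES:
        script = Path(experiment_dir) / stage
        if not script.exists():        # stages are optional
            continue
        print(f"==> {stage}")
        result = subprocess.run(["bash", str(script)])
        if result.returncode != 0:     # fail fast, as a CI service would
            sys.exit(f"stage {stage} failed ({result.returncode})")

if __name__ == "__main__":
    run_pipeline(sys.argv[1] if len(sys.argv) > 1 else ".")
```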
diff --git a/content/publication/jimenez-pdsw-13-poster/cite.bib b/content/publication/jimenez-pdsw-13-poster/cite.bib
new file mode 100644
index 00000000000..3fc017bf9e8
--- /dev/null
+++ b/content/publication/jimenez-pdsw-13-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{jimenez:pdsw13poster,
+ address = {Denver, CO},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jai Dayal and Jay Lofstead},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAcSS1KL2ppbWVuZXotcGRzdzEzcG9zdGVyLnBkZk8RAYYAAAAAAYYAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xhqaW1lbmV6LXBkc3cxM3Bvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAADSS1KAAACAEIvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LXBkc3cxM3Bvc3Rlci5wZGYADgAyABgAagBpAG0AZQBuAGUAegAtAHAAZABzAHcAMQAzAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAtL015IERyaXZlL1BhcGVycy9JLUovamltZW5lei1wZHN3MTNwb3N0ZXIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABDAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAc0=},
+ booktitle = {Poster Session at PDSW 2013 at SC13},
+ date-added = {2013-12-08 21:27:53 +0000},
+ date-modified = {2020-01-04 21:59:53 -0700},
+ keywords = {shortpapers, transactions, hpc, exascale, parallel, datamanagement},
+ month = {November 17},
+ title = {Exploring Trade-offs in Transactional Parallel Data Movement},
+ year = {2013}
+}
+
diff --git a/content/publication/jimenez-pdsw-13-poster/index.md b/content/publication/jimenez-pdsw-13-poster/index.md
new file mode 100644
index 00000000000..ef1ecd6fe91
--- /dev/null
+++ b/content/publication/jimenez-pdsw-13-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "Exploring Trade-offs in Transactional Parallel Data Movement"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.518086Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn", "Jai Dayal", "Jay Lofstead"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at PDSW 2013 at SC13*"
+tags: ["shortpapers", "transactions", "hpc", "exascale", "parallel", "datamanagement"]
+---
+
diff --git a/content/publication/jimenez-pdsw-15/cite.bib b/content/publication/jimenez-pdsw-15/cite.bib
new file mode 100644
index 00000000000..14d079aeada
--- /dev/null
+++ b/content/publication/jimenez-pdsw-15/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:pdsw15,
+ abstract = {Validating experimental results in the field of storage systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. In this position paper, we focus on the latter by analyzing the validation workflow that an experiment re-executioner goes through. We notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics. Based on this insight, we introduce a declarative format for specifying the high-level components of an experiment as well as describing generic, testable conditions that serve as the basis for validation. We present a use case in the area of distributed storage systems to illustrate the usefulness of this approach.},
+ address = {Austin, TX},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Adam Moody and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWSS1KL2ppbWVuZXotcGRzdzE1LnBkZk8RAW4AAAAAAW4AAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xJqaW1lbmV6LXBkc3cxNS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAADSS1KAAACADwvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LXBkc3cxNS5wZGYADgAmABIAagBpAG0AZQBuAGUAegAtAHAAZABzAHcAMQA1AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAnL015IERyaXZlL1BhcGVycy9JLUovamltZW5lei1wZHN3MTUucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA9AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAa8=},
+ booktitle = {PDSW'15},
+ date-added = {2018-05-15 06:28:35 +0000},
+ date-modified = {2020-01-04 23:42:08 -0700},
+ keywords = {papers, reproducibility, declarative},
+ month = {November 15},
+ title = {Tackling the Reproducibility Problem in Storage Systems Research with Declarative Experiment Specifications},
+ year = {2015}
+}
+
diff --git a/content/publication/jimenez-pdsw-15/index.md b/content/publication/jimenez-pdsw-15/index.md
new file mode 100644
index 00000000000..fd3869c5cd2
--- /dev/null
+++ b/content/publication/jimenez-pdsw-15/index.md
@@ -0,0 +1,14 @@
+---
+title: "Tackling the Reproducibility Problem in Storage Systems Research with Declarative Experiment Specifications"
+date: 2015-11-01
+publishDate: 2020-01-05T06:43:50.434250Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn", "Jay Lofstead", "Kathryn Mohror", "Adam Moody", "Remzi Arpaci-Dusseau", "Andrea Arpaci-Dusseau"]
+publication_types: ["1"]
+abstract: "Validating experimental results in the field of storage systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. In this position paper, we focus on the latter by analyzing the validation workflow that an experiment re-executioner goes through. We notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics. Based on this insight, we introduce a declarative format for specifying the high-level components of an experiment as well as describing generic, testable conditions that serve as the basis for validation. We present a use case in the area of distributed storage systems to illustrate the usefulness of this approach."
+featured: false
+publication: "*PDSW'15*"
+tags: ["papers", "reproducibility", "declarative"]
+projects:
+- practical-reproducibility
+---
+
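+The declarative format itself is defined in the paper; purely as an illustration of the idea, here is a minimal sketch (all field names and numbers are hypothetical, not the paper's actual format) of an experiment specification whose validation conditions are testable predicates over metrics gathered during re-execution, rather than exact expected values:
+
+```python
+# Hypothetical sketch: validations are declared against metric names and
+# relative bounds, mirroring the observation that results are validated on
+# the basis of experiment design and high-level goals rather than exact
+# quantitative metrics.
+spec = {
+    "goal": "throughput scales nearly linearly with node count",
+    "components": {"storage": "ceph", "clients": 16, "nodes": [2, 4, 8]},
+    "validations": [
+        "throughput(4) > 1.8 * throughput(2)",
+        "throughput(8) > 1.8 * throughput(4)",
+    ],
+}
+
+def validate(spec, metrics):
+    """Evaluate each declared condition against re-execution metrics."""
+    env = {"throughput": lambda n: metrics["throughput"][n]}
+    return {cond: bool(eval(cond, env)) for cond in spec["validations"]}
+
+# Metrics from a fictional re-execution:
+print(validate(spec, {"throughput": {2: 100.0, 4: 195.0, 8: 370.0}}))
+```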
diff --git a/content/publication/jimenez-reppar-17/cite.bib b/content/publication/jimenez-reppar-17/cite.bib
new file mode 100644
index 00000000000..40602e576c3
--- /dev/null
+++ b/content/publication/jimenez-reppar-17/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:reppar17,
+ abstract = {Independent validation of experimental results in the field of systems research is a challenging task, mainly due to differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce Popper, a convention based on a set of modern open source software (OSS) development principles for generating reproducible scientific publications. Concretely, we make the case for treating an article as an OSS project following a DevOps approach and applying software engineering best practices to manage its associated artifacts and maintain the reproducibility of its findings. Popper leverages existing cloud-computing infrastructure and DevOps tools to produce academic articles that are easy to validate and extend. We present a use case that illustrates the usefulness of this approach. We show how, by following the Popper convention, reviewers and researchers can quickly get to the point of getting results without relying on the original author's intervention.},
+ address = {Orlando, FL},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYSS1KL2ppbWVuZXotcmVwcGFyMTcucGRmTxEBdgAAAAABdgACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FGppbWVuZXotcmVwcGFyMTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAPi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmppbWVuZXotcmVwcGFyMTcucGRmAA4AKgAUAGoAaQBtAGUAbgBlAHoALQByAGUAcABwAGEAcgAxADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACkvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LXJlcHBhcjE3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG5},
+ booktitle = {4th International Workshop on Reproducibility in Parallel Computing (REPPAR) in conjunction with IPDPS 2017},
+ date-added = {2017-07-31 03:27:58 +0000},
+ date-modified = {2020-01-04 21:41:54 -0700},
+ keywords = {papers, reproducibility, devops},
+ month = {June 2},
+ title = {The Popper Convention: Making Reproducible Systems Evaluation Practical},
+ year = {2017}
+}
+
diff --git a/content/publication/jimenez-reppar-17/index.md b/content/publication/jimenez-reppar-17/index.md
new file mode 100644
index 00000000000..4959fdddefc
--- /dev/null
+++ b/content/publication/jimenez-reppar-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "The Popper Convention: Making Reproducible Systems Evaluation Practical"
+date: 2017-06-01
+publishDate: 2020-01-05T06:43:50.445042Z
+authors: ["Ivo Jimenez", "Michael Sevilla", "Noah Watkins", "Carlos Maltzahn", "Jay Lofstead", "Kathryn Mohror", "Andrea Arpac-Dusseau", "Remzi Arpaci-Dusseau"]
+publication_types: ["1"]
+abstract: "Independent validation of experimental results in the field of systems research is a challenging task, mainly due to differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce Popper, a convention based on a set of modern open source software (OSS) development principles for generating reproducible scientific publications. Concretely, we make the case for treating an article as an OSS project following a DevOps approach and applying software engineering best-practices to manage its associated artifacts and maintain the reproducibility of its findings. Popper leverages existing cloud-computing infrastructure and DevOps tools to produce academic articles that are easy to validate and extend. We present a use case that illustrates the usefulness of this approach. We show how, by following the Popper convention, reviewers and researchers can quickly get to the point of getting results without relying on the original author's intervention."
+featured: false
+publication: "*4th International Workshop on Reproducibility in Parallel Computing (REPPAR) in conjunction with IPDPS 2017*"
+tags: ["papers", "reproducibility", "devops"]
+projects:
+- practical-reproducibility
+---
+
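+As a sketch of the convention (stage names and directory layout are assumptions for illustration, not Popper's actual tooling), treating an article as an OSS project means every result is regenerated by a scripted pipeline that a CI service can re-run on each commit:
+
+```python
+# Minimal pipeline runner in the spirit of Popper: each experiment lives
+# in its own directory with shell-scripted stages that CI executes in
+# order, failing fast like a CI job would.
+import pathlib
+import subprocess
+
+STAGES = ["setup.sh", "run.sh", "post-run.sh", "validate.sh"]
+
+def run_pipeline(pipeline_dir):
+    for stage in STAGES:
+        script = pathlib.Path(pipeline_dir) / stage
+        if not script.exists():
+            continue  # stages are optional
+        print(f"==> {stage}")
+        subprocess.run(["bash", str(script)], check=True)
+
+if __name__ == "__main__":
+    run_pipeline("pipelines/scalability-experiment")  # hypothetical path
+```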
diff --git a/content/publication/jimenez-rescue-hpc-18/cite.bib b/content/publication/jimenez-rescue-hpc-18/cite.bib
new file mode 100644
index 00000000000..bd34f7194a5
--- /dev/null
+++ b/content/publication/jimenez-rescue-hpc-18/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{jimenez:rescue-hpc18,
+ abstract = {Advances in agile software delivery methodologies and tools (commonly referred to as DevOps) have not yet materialized in academic scenarios such as university, industry and government laboratories. In this position paper we make the case for Black Swan, a platform for the agile implementation, maintenance and curation of experimentation pipelines by embracing a DevOps approach.},
+ address = {Dallas, TX},
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAcSS1KL2ppbWVuZXotcmVzY3VlLWhwYzE4LnBkZk8RAYYAAAAAAYYAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xhqaW1lbmV6LXJlc2N1ZS1ocGMxOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAADSS1KAAACAEIvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LXJlc2N1ZS1ocGMxOC5wZGYADgAyABgAagBpAG0AZQBuAGUAegAtAHIAZQBzAGMAdQBlAC0AaABwAGMAMQA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAtL015IERyaXZlL1BhcGVycy9JLUovamltZW5lei1yZXNjdWUtaHBjMTgucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABDAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAc0=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAjSS1KL2ppbWVuZXotcmVzY3VlLWhwYzE4LXNsaWRlcy5wZGZPEQGiAAAAAAGiAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8famltZW5lei1yZXNjdWUtaHBjMTgtc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0ktSgAAAgBJLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpJLUo6amltZW5lei1yZXNjdWUtaHBjMTgtc2xpZGVzLnBkZgAADgBAAB8AagBpAG0AZQBuAGUAegAtAHIAZQBzAGMAdQBlAC0AaABwAGMAMQA4AC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASADQvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LXJlc2N1ZS1ocGMxOC1zbGlkZXMucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB8A==},
+ booktitle = {1st Workshop on Reproducible, Customizable and Portable Workflows for HPC (ResCuE-HPC'18, co-located with SC'18)},
+ date-added = {2019-12-26 15:45:05 -0800},
+ date-modified = {2020-01-04 21:20:33 -0700},
+ keywords = {papers, reproducibility},
+ month = {November 11},
+ title = {Spotting Black Swans With Ease: The Case for a Practical Reproducibility Platform},
+ year = {2018}
+}
+
diff --git a/content/publication/jimenez-rescue-hpc-18/index.md b/content/publication/jimenez-rescue-hpc-18/index.md
new file mode 100644
index 00000000000..c14facf59a5
--- /dev/null
+++ b/content/publication/jimenez-rescue-hpc-18/index.md
@@ -0,0 +1,15 @@
+---
+title: "Spotting Black Swans With Ease: The Case for a Practical Reproducibility Platform"
+date: 2018-11-01
+publishDate: 2020-01-05T06:43:50.408658Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Advances in agile software delivery methodologies and tools (commonly referred to as DevOps) have not yet materialized in academic scenarios such as university, industry and government laboratories. In this position paper we make the case for Black Swan, a platform for the agile implementation, maintenance and curation of experimentation pipelines by embracing a DevOps approach."
+featured: false
+publication: "*1st Workshop on Reproducible, Customizable and Portable Workflows for HPC (ResCuE-HPC'18, co-located with SC'18)*"
+url_slides: "http://rescue-hpc.org/_resources/20181111-blackswan_rescue-hpc-sc18-workshop.pdf"
+tags: ["papers", "reproducibility"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/jimenez-tinytocs-16/cite.bib b/content/publication/jimenez-tinytocs-16/cite.bib
new file mode 100644
index 00000000000..7e6cc5c55ca
--- /dev/null
+++ b/content/publication/jimenez-tinytocs-16/cite.bib
@@ -0,0 +1,15 @@
+@article{jimenez:tinytocs16,
+ abstract = {Validating experimental results in the field of computer systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. By focusing on the latter and analyzing the validation workflow that an experiment re-executioner goes through, we notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics.
+Based on this insight, we introduce a declarative format for describing the high-level components of an experiment, as well as a language for specifying generic, testable statements that serve as the basis for validation [1,2]. Our language allows users to express and validate statements on top of metrics gathered at runtime. We demonstrate the feasibility of this approach by taking an experiment from an already published article and obtaining the corresponding experiment specification. We show that, if we had this specification in the first place, validating the original findings would be an almost entirely automated task. If we contrast this with the current state of our practice, where it takes days or weeks (if successful) to reproduce results, we see how making experiment specifications available as part of a publication or as an addendum to experimental results can significantly aid in the validation of computer systems research.
+Acknowledgements: Work performed under the auspices of US DOE by LLNL contract DE-AC52-07NA27344 ABS-684863 and by SNL contract DE-AC04-94AL85000.},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Adam Moody and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAaSS1KL2ppbWVuZXotdGlueXRvY3MxNi5wZGZPEQF+AAAAAAF+AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8WamltZW5lei10aW55dG9jczE2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0ktSgAAAgBALzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpJLUo6amltZW5lei10aW55dG9jczE2LnBkZgAOAC4AFgBqAGkAbQBlAG4AZQB6AC0AdABpAG4AeQB0AG8AYwBzADEANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKy9NeSBEcml2ZS9QYXBlcnMvSS1KL2ppbWVuZXotdGlueXRvY3MxNi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEEAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABww==},
+ date-added = {2019-12-26 18:43:34 -0800},
+ date-modified = {2020-01-04 21:15:26 -0700},
+ journal = {Tiny Transactions on Computer Science (TinyToCS)},
+ keywords = {papers, reproducibility, evaluation},
+ title = {I Aver: Providing Declarative Experiment Specifications Facilitates the Evaluation of Computer Systems Research},
+ volume = {4},
+ year = {2016}
+}
+
diff --git a/content/publication/jimenez-tinytocs-16/index.md b/content/publication/jimenez-tinytocs-16/index.md
new file mode 100644
index 00000000000..aa2ec028afd
--- /dev/null
+++ b/content/publication/jimenez-tinytocs-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "I Aver: Providing Declarative Experiment Specifications Facilitates the Evaluation of Computer Systems Research"
+date: 2016-01-01
+publishDate: 2020-01-05T06:43:50.393642Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn", "Jay Lofstead", "Adam Moody", "Kathryn Mohror", "Remzi Arpaci-Dusseau", "Andrea Arpaci-Dusseau"]
+publication_types: ["2"]
+abstract: "Validating experimental results in the field of computer systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. By focusing on the latter and analyzing the validation workflow that an experiment re-executioner goes through, we notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics. Based on this insight, we introduce a declarative format for describing the high-level components of an experiment, as well as a language for specifying generic, testable statements that serve as the basis for validation [1,2]. Our language allows to express and validate statements on top of metrics gathered at runtime. We demonstrate the feasibility of this approach by taking an experiment from an already published article and obtain the corresponding experiment specification. We show that, if we had this specification in the first place, validating the original findings would be an almost entirely automated task. If we contrast this with the current state of our practice, where it takes days or weeks (if successful) to reproduce results, we see how making experiment specifications available as part of a publication or as addendum to experimental results can significantly aid in the validation of computer systems research. Acknowledgements: Work performed under auspices of US DOE by LLNL contract DE-AC52- 07NA27344 ABS-684863 and by SNL contract DE-AC04-94AL85000."
+featured: false
+publication: "*Tiny Transactions on Computer Science (TinyToCS)*"
+tags: ["papers", "reproducibility", "evaluation"]
+---
+
diff --git a/content/publication/jimenez-ucsctr-16/cite.bib b/content/publication/jimenez-ucsctr-16/cite.bib
new file mode 100644
index 00000000000..6b05c8364e9
--- /dev/null
+++ b/content/publication/jimenez-ucsctr-16/cite.bib
@@ -0,0 +1,16 @@
+@techreport{jimenez:ucsctr16,
+ abstract = {Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce the Popper Convention, a set of principles for producing scientific publications. Concretely, we make the case for treating an article as an open source software (OSS) project, applying software engineering best practices to manage its associated artifacts and maintain the reproducibility of its findings. We leverage existing cloud-computing infrastructure and modern OSS development tools to produce academic articles that are easy to validate. We present our prototype file system, GassyFS, as a use case for illustrating the usefulness of this approach. We show how, by following Popper, re-executing experiments on multiple platforms is more practical, allowing reviewers and students to quickly get to the point of getting results without relying on the author's intervention.},
+ address = {Santa Cruz, CA},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYSS1KL2ppbWVuZXotdWNzY3RyMTYucGRmTxEBdgAAAAABdgACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FGppbWVuZXotdWNzY3RyMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAPi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmppbWVuZXotdWNzY3RyMTYucGRmAA4AKgAUAGoAaQBtAGUAbgBlAHoALQB1AGMAcwBjAHQAcgAxADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACkvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LXVjc2N0cjE2LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG5},
+ date-added = {2016-08-18 05:58:51 +0000},
+ date-modified = {2020-01-04 21:49:52 -0700},
+ institution = {UC Santa Cruz},
+ keywords = {papers, reproducibility, systems, evaluation},
+ month = {May 19},
+ number = {UCSC-SOE-16-10},
+ title = {Popper: Making Reproducible Systems Performance Evaluation Practical},
+ type = {Tech. rept.},
+ year = {2016}
+}
+
diff --git a/content/publication/jimenez-ucsctr-16/index.md b/content/publication/jimenez-ucsctr-16/index.md
new file mode 100644
index 00000000000..cb7b48f56f7
--- /dev/null
+++ b/content/publication/jimenez-ucsctr-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "Popper: Making Reproducible Systems Performance Evaluation Practical"
+date: 2016-05-01
+publishDate: 2020-01-05T06:43:50.463953Z
+authors: ["Ivo Jimenez", "Michael Sevilla", "Noah Watkins", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: "Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. Recreating an environment that resembles the original systems research is difficult and time-consuming. In this paper we introduce the Popper Convention, a set of principles for producing scientific publications. Concretely, we make the case for treating an article as an open source software (OSS) project, applying software engineering best-practices to manage its associated artifacts and maintain the reproducibility of its findings. Leveraging existing cloud-computing infrastructure and modern OSS development tools to produce academic articles that are easy to validate. We present our prototype file system, GassyFS, as a use case for illustrating the usefulness of this approach. We show how, by following Popper, re-executing experiments on multiple platforms is more practical, allowing reviewers and students to quickly get to the point of getting results without relying on the author's intervention."
+featured: false
+publication: ""
+tags: ["papers", "reproducibility", "systems", "evaluation"]
+---
+
diff --git a/content/publication/jimenez-varsys-16/cite.bib b/content/publication/jimenez-varsys-16/cite.bib
new file mode 100644
index 00000000000..34deb415b59
--- /dev/null
+++ b/content/publication/jimenez-varsys-16/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:varsys16,
+ abstract = {Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. In particular, when an experiment runs on different hardware than the one where it originally executed, predicting the differences in results is difficult. In this paper, we introduce an architecture-independent method for characterizing the performance of a machine by obtaining a profile (a vector of microbenchmark results) that we use to quantify the variability between two hardware platforms. We propose the use of isolation features that OS-level virtualization offers to reduce the variability observed when validating application performance across multiple machines. Our results show that, using our variability characterization methodology, we can correctly predict the variability bounds of CPU-intensive applications, as well as reduce variability by up to 2.8x if we make use of CPU bandwidth limitations, depending on the opcode mix of an application, as well as generational and architectural differences between two hardware platforms.},
+ address = {Chicago, IL},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Adam Moody and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYSS1KL2ppbWVuZXotdmFyc3lzMTYucGRmTxEBdgAAAAABdgACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FGppbWVuZXotdmFyc3lzMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAPi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmppbWVuZXotdmFyc3lzMTYucGRmAA4AKgAUAGoAaQBtAGUAbgBlAHoALQB2AGEAcgBzAHkAcwAxADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACkvTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LXZhcnN5czE2LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG5},
+ booktitle = {VarSys'16},
+ date-added = {2016-05-19 13:24:07 +0000},
+ date-modified = {2020-01-04 21:50:21 -0700},
+ keywords = {papers, reproducibility},
+ month = {May 23},
+ title = {Characterizing and Reducing Cross-Platform Performance Variability Using OS-level Virtualization},
+ year = {2016}
+}
+
diff --git a/content/publication/jimenez-varsys-16/index.md b/content/publication/jimenez-varsys-16/index.md
new file mode 100644
index 00000000000..8e21dcda038
--- /dev/null
+++ b/content/publication/jimenez-varsys-16/index.md
@@ -0,0 +1,14 @@
+---
+title: "Characterizing and Reducing Cross-Platform Performance Variability Using OS-level Virtualization"
+date: 2016-05-01
+publishDate: 2020-01-05T06:43:50.466138Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn", "Jay Lofstead", "Adam Moody", "Kathryn Mohror", "Remzi Arpaci-Dusseau", "Andrea Arpaci-Dusseau"]
+publication_types: ["1"]
+abstract: "Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. In particular, when an experiment runs on different hardware than the one where it originally executed, predicting the differences in results is difficult. In this paper, we introduce an architecture-independent method for characterizing the performance of a machine by obtaining a profile (a vector of microbenchark results) that we use to quantify the variability between two hardware platforms. We propose the use of isolation features that OS-level virtualization offers to reduce the variability observed when validating application performance across multiple machines. Our results show that, using our variability characterization methodology, we can correctly predict the variability bounds of CPU-intensive applications, as well as reduce it by up to 2.8x if we make use of CPU bandwidth limitations, depending on the opcode mix of an application, as well as generational and architectural differences between two hardware platforms."
+featured: false
+publication: "*VarSys'16*"
+tags: ["papers", "reproducibility", ""]
+projects:
+- practical-reproducibility
+---
+
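+To make the profiling idea concrete, here is a toy sketch (invented numbers and benchmark names, not the paper's methodology) of characterizing two machines by normalized microbenchmark vectors and deriving the bounds within which an application's cross-platform performance ratio is expected to fall:
+
+```python
+# Sketch: a machine profile is a vector of microbenchmark scores; the
+# per-dimension ratio range between two profiles bounds the expected
+# slowdown/speedup of a CPU-intensive application, depending on its
+# opcode mix.
+base   = {"int_ops": 1.00, "flt_ops": 1.00, "mem_bw": 1.00}  # platform A (reference)
+target = {"int_ops": 1.35, "flt_ops": 1.10, "mem_bw": 0.90}  # platform B, relative to A
+
+def variability_bounds(base, target):
+    ratios = [target[k] / base[k] for k in base]
+    return min(ratios), max(ratios)
+
+lo, hi = variability_bounds(base, target)
+print(f"expected cross-platform performance ratio: {lo:.2f}x to {hi:.2f}x")
+```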
diff --git a/content/publication/jimenez-woc-15/cite.bib b/content/publication/jimenez-woc-15/cite.bib
new file mode 100644
index 00000000000..eff5b57a877
--- /dev/null
+++ b/content/publication/jimenez-woc-15/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{jimenez:woc15,
+ abstract = {Evaluating experimental results in the field of computer systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. In this position paper, we analyze salient features of container technology that, if leveraged correctly, can help reduce the complexity of reproducing experiments in systems research. We present a use case in the area of distributed storage systems to illustrate the extensions that we envision, mainly in terms of container management infrastructure. We also discuss the benefits and limitations of using containers as a way of reproducing research in other areas of experimental systems research.},
+ address = {Tempe, AZ},
+ author = {Ivo Jimenez and Carlos Maltzahn and Adam Moody and Kathryn Mohror and Jay Lofstead and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSS1KL2ppbWVuZXotd29jMTUucGRmTxEBagAAAAABagACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EWppbWVuZXotd29jMTUucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANJLUoAAAIAOy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SS1KOmppbWVuZXotd29jMTUucGRmAAAOACQAEQBqAGkAbQBlAG4AZQB6AC0AdwBvAGMAMQA1AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAmL015IERyaXZlL1BhcGVycy9JLUovamltZW5lei13b2MxNS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGq},
+ booktitle = {First Workshop on Containers (WoC 2015) (Workshop co-located with IEEE International Conference on Cloud Engineering - IC2E 2015)},
+ date-added = {2019-12-26 16:08:16 -0800},
+ date-modified = {2020-01-04 21:19:26 -0700},
+ keywords = {papers, reproducibility, containers},
+ month = {March 9-13},
+ title = {The Role of Container Technology in Reproducible Computer Systems Research},
+ year = {2015}
+}
+
diff --git a/content/publication/jimenez-woc-15/index.md b/content/publication/jimenez-woc-15/index.md
new file mode 100644
index 00000000000..914591344e6
--- /dev/null
+++ b/content/publication/jimenez-woc-15/index.md
@@ -0,0 +1,14 @@
+---
+title: "The Role of Container Technology in Reproducible Computer Systems Research"
+date: 2015-03-01
+publishDate: 2020-01-05T06:43:50.404631Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn", "Adam Moody", "Kathryn Mohror", "Jay Lofstead", "Remzi Arpaci-Dusseau", "Andrea Arpaci-Dusseau"]
+publication_types: ["1"]
+abstract: "Evaluating experimental results in the field of com- puter systems is a challenging task, mainly due to the many changes in software and hardware that computational environ- ments go through. In this position paper, we analyze salient features of container technology that, if leveraged correctly, can help reduce the complexity of reproducing experiments in systems research. We present a use case in the area of distributed storage systems to illustrate the extensions that we envision, mainly in terms of container management infrastructure. We also discuss the benefits and limitations of using containers as a way of reproducing research in other areas of experimental systems research."
+featured: false
+publication: "*First Workshop on Containers (WoC 2015) (Workshop co-located with IEEE International Conference on Cloud Engineering - IC2E 2015)*"
+tags: ["papers", "reproduibility", "containers"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/jimenez-xldb-18/cite.bib b/content/publication/jimenez-xldb-18/cite.bib
new file mode 100644
index 00000000000..ccffd4075c1
--- /dev/null
+++ b/content/publication/jimenez-xldb-18/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{jimenez:xldb18,
+ address = {Stanford, CA},
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {Lightning Talk and Poster Session at the 11th Extremely Large Databases Conference (XLDB)},
+ date-added = {2019-12-26 19:14:42 -0800},
+ date-modified = {2019-12-29 16:35:19 -0800},
+ keywords = {shortpapers, reproducibility},
+ month = {April 30},
+ title = {Reproducible Computational and Data-Intensive Experimentation Pipelines with Popper},
+ year = {2018}
+}
+
diff --git a/content/publication/jimenez-xldb-18/index.md b/content/publication/jimenez-xldb-18/index.md
new file mode 100644
index 00000000000..326b3d0da56
--- /dev/null
+++ b/content/publication/jimenez-xldb-18/index.md
@@ -0,0 +1,12 @@
+---
+title: "Reproducible Computational and Data-Intensive Experimentation Pipelines with Popper"
+date: 2018-04-01
+publishDate: 2020-01-05T06:43:50.388523Z
+authors: ["Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Lightning Talk and Poster Session at the 11th Extremely Large Databases Conference (XLDB)*"
+tags: ["shortpapers", "reproducibility"]
+---
+
diff --git a/content/publication/kaldewey-fast-08-wip/cite.bib b/content/publication/kaldewey-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..4f57ec8fd1c
--- /dev/null
+++ b/content/publication/kaldewey-fast-08-wip/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{kaldewey:fast08wip,
+ address = {San Jose, CA},
+ author = {Tim Kaldewey and Andrew Shewmaker and Richard Golding and Carlos Maltzahn and Theodore Wong and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYSy9rYWxkZXdleS1mYXN0MDh3aXAucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FmthbGRld2V5LWZhc3QwOHdpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a2FsZGV3ZXktZmFzdDA4d2lwLnBkZgAOAC4AFgBrAGEAbABkAGUAdwBlAHkALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvSy9rYWxkZXdleS1mYXN0MDh3aXAucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ booktitle = {Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:42:04 -0800},
+ keywords = {shortpapers, qos, networking, storage},
+ month = {February 26-29},
+ title = {RADoN: QoS in Storage Networks},
+ year = {2008}
+}
+
diff --git a/content/publication/kaldewey-fast-08-wip/index.md b/content/publication/kaldewey-fast-08-wip/index.md
new file mode 100644
index 00000000000..453f76a11e8
--- /dev/null
+++ b/content/publication/kaldewey-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "RADoN: QoS in Storage Networks"
+date: 2008-02-01
+publishDate: 2020-01-05T06:43:50.720771Z
+authors: ["Tim Kaldewey", "Andrew Shewmaker", "Richard Golding", "Carlos Maltzahn", "Theodore Wong", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)*"
+tags: ["shortpapers", "qos", "networking", "storage"]
+---
+
diff --git a/content/publication/kaldewey-rtas-08/cite.bib b/content/publication/kaldewey-rtas-08/cite.bib
new file mode 100644
index 00000000000..e62d3cc0e23
--- /dev/null
+++ b/content/publication/kaldewey-rtas-08/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{kaldewey:rtas08,
+ abstract = {Large- and small-scale storage systems frequently serve a mixture of workloads, an increasing number of which require some form of performance guarantee. Providing guaranteed disk performance---the equivalent of a ``virtual disk''---is challenging because disk requests are non-preemptible and their execution times are stateful, partially non-deterministic, and can vary by orders of magnitude. Guaranteeing throughput, the standard measure of disk performance, requires worst-case I/O time assumptions orders of magnitude greater than average I/O times, with correspondingly low performance and poor control of the resource allocation. We show that disk time utilization---analogous to CPU utilization in CPU scheduling and the only fully provisionable aspect of disk performance---yields greater control, more efficient use of disk resources, and better isolation between request streams than bandwidth or I/O rate when used as the basis for disk reservation and scheduling.},
+ address = {St. Louis, Missouri},
+ annote = {Springer Journal of Real-Time Systems Award for Best Student Paper},
+ author = {Tim Kaldewey and Anna Povzner and Theodore Wong and Richard Golding and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSy9rYWxkZXdleS1ydGFzMDgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2thbGRld2V5LXJ0YXMwOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a2FsZGV3ZXktcnRhczA4LnBkZgAADgAoABMAawBhAGwAZABlAHcAZQB5AC0AcgB0AGEAcwAwADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0sva2FsZGV3ZXktcnRhczA4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {RTAS 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:27:49 -0700},
+ keywords = {papers, performance, management, storage, systems, fahrrad, rbed, qos},
+ month = {April},
+ title = {Virtualizing Disk Performance},
+ year = {2008}
+}
+
diff --git a/content/publication/kaldewey-rtas-08/index.md b/content/publication/kaldewey-rtas-08/index.md
new file mode 100644
index 00000000000..e5eff5212c3
--- /dev/null
+++ b/content/publication/kaldewey-rtas-08/index.md
@@ -0,0 +1,12 @@
+---
+title: "Virtualizing Disk Performance"
+date: 2008-04-01
+publishDate: 2020-01-05T13:33:06.023046Z
+authors: ["Tim Kaldewey", "Anna Povzner", "Theodore Wong", "Richard Golding", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Large- and small-scale storage systems frequently serve a mixture of workloads, an increasing number of which require some form of performance guarantee. Providing guaranteed disk performance---the equivalent of a ``virtual disk''---is challenging because disk requests are non-preemptible and their execution times are stateful, partially non-deterministic, and can vary by orders of magnitude. Guaranteeing throughput, the standard measure of disk performance, requires worst-case I/O time assumptions orders of magnitude greater than average I/O times, with correspondingly low performance and poor control of the resource allocation. We show that disk time utilization--- analogous to CPU utilization in CPU scheduling and the only fully provisionable aspect of disk performance---yields greater control, more efficient use of disk resources, and better isolation between request streams than bandwidth or I/O rate when used as the basis for disk reservation and scheduling."
+featured: false
+publication: "*RTAS 2008 (**Best Student Paper**)*"
+tags: ["papers", "performance", "management", "storage", "systems", "fahrrad", "rbed", "qos"]
+---
+
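+The core idea can be sketched in a few lines (illustrative only, not the paper's scheduler): because disk time utilization is the only fully provisionable aspect of disk performance, admission control reduces to keeping the sum of reserved utilizations at or below 100%:
+
+```python
+# Toy utilization-based reservation: each request stream reserves a
+# fraction of disk *time* rather than a throughput number, so admission
+# control is a simple sum, independent of worst-case I/O time assumptions.
+class DiskReservations:
+    def __init__(self):
+        self.streams = {}  # stream name -> reserved fraction of disk time
+
+    def admit(self, name, utilization):
+        if sum(self.streams.values()) + utilization > 1.0:
+            return False  # would oversubscribe disk time
+        self.streams[name] = utilization
+        return True
+
+res = DiskReservations()
+print(res.admit("media", 0.6))   # True
+print(res.admit("backup", 0.3))  # True
+print(res.admit("scan", 0.2))    # False: only 10% of disk time is left
+```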
diff --git a/content/publication/kato-usenix-12/cite.bib b/content/publication/kato-usenix-12/cite.bib
new file mode 100644
index 00000000000..fa6a13b092f
--- /dev/null
+++ b/content/publication/kato-usenix-12/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{kato:usenix12,
+ abstract = {Graphics processing units (GPUs) have become a very powerful platform embracing a concept of heterogeneous many-core computing. However, application domains of GPUs are currently limited to specific systems, largely due to a lack of ``first-class'' GPU resource management for general-purpose multi-tasking systems.
+We present Gdev, a new ecosystem of GPU resource management in the operating system (OS). It allows the user space as well as the OS itself to use GPUs as first-class computing resources. Specifically, Gdev's virtual memory manager supports data swapping for excessive memory resource demands, and also provides a shared device memory functionality that allows GPU contexts to communicate with other contexts. Gdev further provides a GPU scheduling scheme to virtualize a physical GPU into multiple logical GPUs, enhancing isolation among working sets of multi-tasking systems.
+Our evaluation conducted on Linux and the NVIDIA GPU shows that the basic performance of our prototype implementation is reliable even compared to proprietary software. Further detailed experiments demonstrate that Gdev achieves a 2x speedup for an encrypted file system using the GPU in the OS. Gdev can also improve the makespan of dataflow programs by up to 49% exploiting shared device memory, while an error in the utilization of virtualized GPUs can be limited to within only 7%.},
+ address = {Boston, MA},
+ author = {Shinpei Kato and Michael McThrow and Carlos Maltzahn and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATSy9rYXRvLXVzZW5peDEyLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFrYXRvLXVzZW5peDEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSwAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpLOmthdG8tdXNlbml4MTIucGRmAAAOACQAEQBrAGEAdABvAC0AdQBzAGUAbgBpAHgAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9LL2thdG8tdXNlbml4MTIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {USENIX ATC '12},
+ date-added = {2012-04-06 22:55:09 +0000},
+ date-modified = {2020-01-05 05:30:40 -0700},
+ keywords = {papers, gpgpu, kernel, linux, scheduling},
+ title = {Gdev: First-Class GPU Resource Management in the Operating System},
+ year = {2012}
+}
+
diff --git a/content/publication/kato-usenix-12/index.md b/content/publication/kato-usenix-12/index.md
new file mode 100644
index 00000000000..89f43e7657c
--- /dev/null
+++ b/content/publication/kato-usenix-12/index.md
@@ -0,0 +1,15 @@
+---
+title: "Gdev: First-Class GPU Resource Management in the Operating System"
+date: 2012-01-01
+publishDate: 2020-01-05T13:33:05.973549Z
+authors: ["Shinpei Kato", "Michael McThrow", "Carlos Maltzahn", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: "Graphics processing units (GPUs) have become a very powerful platform embracing a concept of heterogeneous many-core computing. However, application domains of GPUs are currently limited to specific systems, largely due to a lack of ``first-class'' GPU resource management for general-purpose multi-tasking systems. We present Gdev, a new ecosystem of GPU resource management in the operating system (OS). It allows the user space as well as the OS itself to use GPUs as first-class computing resources. Specifically, Gdev's virtual memory manager supports data swapping for excessive memory resource demands, and also provides a shared device memory functionality that allows GPU contexts to communicate with other contexts. Gdev further provides a GPU scheduling scheme to virtualize a physical GPU into multiple logical GPUs, enhancing isolation among working sets of multi-tasking systems. Our evaluation conducted on Linux and the NVIDIA GPU shows that the basic performance of our prototype implementation is reliable even compared to proprietary software. Further detailed experiments demonstrate that Gdev achieves a 2x speedup for an encrypted file system using the GPU in the OS. Gdev can also improve the makespan of dataflow programs by up to 49% exploiting shared device memory, while an error in the utilization of virtualized GPUs can be limited within only 7%."
+featured: false
+publication: "*USENIX ATC '12*"
+url_slides: "https://www.usenix.org/sites/default/files/conference/protected-files/kato_atc12_slides.pdf"
+url_video: "https://www.usenix.org/conference/usenixfederatedconferencesweek/gdev-first-class-gpu-resource-management-operating-system"
+url_audio: "https://www.usenix.org/conference/usenixfederatedconferencesweek/gdev-first-class-gpu-resource-management-operating-system"
+tags: ["papers", "gpgpu", "kernel", "linux", "scheduling"]
+---
+
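+As a rough illustration of virtualizing one physical GPU into multiple logical GPUs (a toy budget scheduler, not Gdev's actual algorithm), a scheduler can charge each vGPU for the runtime of its non-preemptible commands and always dispatch the vGPU with the most remaining budget:
+
+```python
+# Toy proportional-share vGPU scheduler: budgets are assigned in
+# proportion to each vGPU's share; commands run to completion and are
+# charged against the issuing vGPU's budget.
+def schedule(shares, jobs, period=100.0):
+    budget = {v: s * period for v, s in shares.items()}
+    queues = {v: [c for u, c in jobs if u == v] for v in shares}
+    order = []
+    while any(queues.values()):
+        ready = [v for v in shares if queues[v]]
+        v = max(ready, key=lambda v: budget[v])  # most budget dispatches first
+        cost = queues[v].pop(0)
+        budget[v] -= cost  # non-preemptible: charge the full command runtime
+        order.append((v, cost))
+    return order
+
+print(schedule({"vgpu0": 0.5, "vgpu1": 0.5},
+               [("vgpu0", 30), ("vgpu0", 30), ("vgpu1", 20), ("vgpu1", 20)]))
+```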
diff --git a/content/publication/klasky-jp-16/cite.bib b/content/publication/klasky-jp-16/cite.bib
new file mode 100644
index 00000000000..45ba7dd8c37
--- /dev/null
+++ b/content/publication/klasky-jp-16/cite.bib
@@ -0,0 +1,16 @@
+@article{klasky:jp16,
+ abstract = {As the exascale computing age emerges, data-related issues are becoming critical factors that determine how and where we do computing. Popular approaches used by traditional I/O solutions and storage libraries become increasingly bottlenecked due to their assumptions about data movement, re-organization, and storage. While new technologies, such as ``burst buffers'', can help address some of the short-term performance issues, it is essential that we reexamine the underlying storage and I/O infrastructure to effectively support requirements and challenges at exascale and beyond. In this paper we present a new approach to the exascale Storage System and I/O (SSIO), which is based on allowing users to inject application knowledge into the system and leverage this knowledge to better manage, store, and access large data volumes so as to minimize the time to scientific insights. Central to our approach is the distinction between the data, metadata, and the knowledge contained therein, transferred from the user to the system by describing the ``utility'' of data as it ages.},
+ author = {Scott A. Klasky and Hasan Abbasi and Mark Ainsworth and J. Choi and Matthew Curry and T. Kurc and Qing Liu and Jay Lofstead and Carlos Maltzahn and Manish Parashar and Norbert Podhorszki and Eric Suchyta and Fang Wang and Matthew Wolf and C. S. Chang and M. Churchill and S. Ethier},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARSy9rbGFza3ktanAxNi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Pa2xhc2t5LWpwMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUsAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6SzprbGFza3ktanAxNi5wZGYAAA4AIAAPAGsAbABhAHMAawB5AC0AagBwADEANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvSy9rbGFza3ktanAxNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ date-added = {2017-01-14 20:46:38 +0000},
+ date-modified = {2020-01-04 21:45:50 -0700},
+ journal = {J. Phys.: Conf. Ser.},
+ keywords = {papers, storage, exascale, systems, hpc},
+ month = {November 11},
+ number = {1},
+ pages = {012095},
+ title = {Exascale Storage Systems the SIRIUS Way},
+ volume = {759},
+ year = {2016}
+}
+
diff --git a/content/publication/klasky-jp-16/index.md b/content/publication/klasky-jp-16/index.md
new file mode 100644
index 00000000000..ec7cfef8634
--- /dev/null
+++ b/content/publication/klasky-jp-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "Exascale Storage Systems the SIRIUS Way"
+date: 2016-11-01
+publishDate: 2020-01-05T06:43:50.451991Z
+authors: ["Scott A. Klasky", "Hasan Abbasi", "Mark Ainsworth", "J. Choi", "Matthew Curry", "T. Kurc", "Qing Liu", "Jay Lofstead", "Carlos Maltzahn", "Manish Parashar", "Norbert Podhorszki", "Eric Suchyta", "Fang Wang", "Matthew Wolf", "C. S. Chang", "M. Churchill", "S. Ethier"]
+publication_types: ["2"]
+abstract: "As the exascale computing age emerges, data related issues are becoming critical factors that determine how and where we do computing. Popular approaches used by traditional I/O solution and storage libraries become increasingly bottlenecked due to their assumptions about data movement, re-organization, and storage. While, new technologies, such as ``burst buffers'', can help address some of the short-term performance issues, it is essential that we reexamine the underlying storage and I/O infrastructure to effectively support requirements and challenges at exascale and beyond. In this paper we present a new approach to the exascale Storage System and I/O (SSIO), which is based on allowing users to inject application knowledge into the system and leverage this knowledge to better manage, store, and access large data volumes so as to minimize the time to scientific insights. Central to our approach is the distinction between the data, metadata, and the knowledge contained therein, transferred from the user to the system by describing ``utility'' of data as it ages."
+featured: false
+publication: "*J. Phys.: Conf. Ser.*"
+tags: ["papers", "storage", "exascale", "systems", "hpc"]
+---
+
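+One way to picture the "utility of data as it ages" notion (a hypothetical sketch, not SIRIUS code; tier names and thresholds are made up): the user declares how a dataset's utility decays over time, and the storage system maps the current utility to a tier and fidelity:
+
+```python
+# Toy utility-driven placement: exponential utility decay with a
+# user-declared half-life drives tiering and fidelity decisions.
+import math
+
+def utility(age_hours, half_life_hours=24.0):
+    return math.exp(-math.log(2) * age_hours / half_life_hours)
+
+def place(age_hours):
+    u = utility(age_hours)
+    if u > 0.5:
+        return "burst buffer (full resolution)"
+    if u > 0.1:
+        return "parallel file system (full resolution)"
+    return "archival storage (reduced-fidelity copy)"
+
+for age in (1, 30, 200):
+    print(f"age {age:>3}h -> utility {utility(age):.2f} -> {place(age)}")
+```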
diff --git a/content/publication/koren-pdsw-07/cite.bib b/content/publication/koren-pdsw-07/cite.bib
new file mode 100644
index 00000000000..29ea188b3d4
--- /dev/null
+++ b/content/publication/koren-pdsw-07/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{koren:pdsw07,
+ abstract = {As users interact with file systems of ever increasing size, it is becoming more difficult for them to familiarize themselves with the entire contents of the file system. In petabyte-scale systems, users must navigate a pool of billions of shared files in order to find the information they are looking for. One way to help alleviate this problem is to integrate navigation and search into a common framework.
+One such method is faceted search. This method originated within the information retrieval community, and has proved popular for navigating large repositories, such as those in e-commerce sites and digital libraries. This paper introduces faceted search and outlines several current research directions in adapting faceted search techniques to petabyte-scale file systems.},
+ address = {Reno, NV},
+ author = {Jonathan Koren and Yi Zhang and Sasha Ames and Andrew Leung and Carlos Maltzahn and Ethan L. Miller},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASSy9rb3Jlbi1wZHN3MDcucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGtvcmVuLXBkc3cwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a29yZW4tcGRzdzA3LnBkZgAOACIAEABrAG8AcgBlAG4ALQBwAGQAcwB3ADAANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvSy9rb3Jlbi1wZHN3MDcucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:24:17 -0700},
+ keywords = {papers, ir, filesystems, metadata, facets, search},
+ month = {November},
+ title = {Searching and Navigating Petabyte Scale File Systems Based on Facets},
+ year = {2007}
+}
+
diff --git a/content/publication/koren-pdsw-07/index.md b/content/publication/koren-pdsw-07/index.md
new file mode 100644
index 00000000000..b6eab59fb79
--- /dev/null
+++ b/content/publication/koren-pdsw-07/index.md
@@ -0,0 +1,14 @@
+---
+title: "Searching and Navigating Petabyte Scale File Systems Based on Facets"
+date: 2007-11-01
+publishDate: 2020-01-05T13:33:06.019209Z
+authors: ["Jonathan Koren", "Yi Zhang", "Sasha Ames", "Andrew Leung", "Carlos Maltzahn", "Ethan L. Miller"]
+publication_types: ["1"]
+abstract: "As users interact with file systems of ever increasing size, it is becoming more difficult for them to familiarize themselves with the entire contents of the file system. In petabyte-scale systems, users must navigate a pool of billions of shared files in order to find the information they are looking for. One way to help alleviate this problem is to integrate navigation and search into a common framework. One such method is faceted search. This method originated within the information retrieval community, and has proved popular for navigating large repositories, such as those in e-commerce sites and digital libraries. This paper introduces faceted search and outlines several current research directions in adapting faceted search techniques to petabyte-scale file systems."
+featured: false
+publication: "*Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)*"
+tags: ["papers", "ir", "filesystems", "metadata", "facets", "search"]
+projects:
+- metadata-rich
+---
+
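+As a toy illustration of faceted navigation over file metadata (in-memory and with invented fields; the paper targets billions of files, not Python lists), each selection narrows the result set and the remaining facet values are re-counted:
+
+```python
+from collections import Counter
+
+# Hypothetical file metadata records.
+files = [
+    {"owner": "alice", "type": "pdf",  "project": "ceph"},
+    {"owner": "bob",   "type": "pdf",  "project": "ceph"},
+    {"owner": "alice", "type": "data", "project": "sirius"},
+]
+
+def facet_counts(files, selections):
+    """Filter by current facet selections, then count remaining values."""
+    hits = [f for f in files
+            if all(f.get(k) == v for k, v in selections.items())]
+    counts = {}
+    for f in hits:
+        for k, v in f.items():
+            if k not in selections:
+                counts.setdefault(k, Counter())[v] += 1
+    return hits, counts
+
+hits, counts = facet_counts(files, {"type": "pdf"})
+print(len(hits), {k: dict(c) for k, c in counts.items()})
+```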
diff --git a/content/publication/kroeger-unpublished-96/cite.bib b/content/publication/kroeger-unpublished-96/cite.bib
new file mode 100644
index 00000000000..21a5647afba
--- /dev/null
+++ b/content/publication/kroeger-unpublished-96/cite.bib
@@ -0,0 +1,10 @@
+@unpublished{kroeger:unpublished96,
+ author = {Thomas M. Kroeger and Jeff Mogul and Carlos Maltzahn},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2009-12-14 11:55:00 -0800},
+ local-url = {/Users/carlosmalt/Documents/Papers/kroeger-unpublished96.pdf},
+ note = {ftp://ftp.digital.com/pub/DEC/traces/proxy/webtraces.v1.2.html},
+ title = {Digital's Web Proxy Traces},
+ year = {1996}
+}
+
diff --git a/content/publication/kroeger-unpublished-96/index.md b/content/publication/kroeger-unpublished-96/index.md
new file mode 100644
index 00000000000..8634789e0e3
--- /dev/null
+++ b/content/publication/kroeger-unpublished-96/index.md
@@ -0,0 +1,11 @@
+---
+title: "Digital's Web Proxy Traces"
+date: 1996-01-01
+publishDate: 2020-01-05T06:43:50.687259Z
+authors: ["Thomas M. Kroeger", "Jeff Mogul", "Carlos Maltzahn"]
+publication_types: ["3"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/kufeldt-fast-18-wip/cite.bib b/content/publication/kufeldt-fast-18-wip/cite.bib
new file mode 100644
index 00000000000..8d25b58dd0a
--- /dev/null
+++ b/content/publication/kufeldt-fast-18-wip/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{kufeldt:fast18wip,
+ address = {Oakland, CA},
+ author = {Philip Kufeldt and Timothy Feldman and Christine Green and Grant Mackey and Carlos Maltzahn and Shingo Tanaka},
+ booktitle = {WiP and Poster Sessions at 16th USENIX Conference on File and Storage Technologies (FAST'18)},
+ date-added = {2019-12-26 19:17:05 -0800},
+ date-modified = {2019-12-29 16:35:11 -0800},
+ keywords = {shortpapers, eusocial, embedded, storage},
+ month = {Feb 12-15},
+ title = {Eusocial Storage Devices},
+ year = {2018}
+}
+
diff --git a/content/publication/kufeldt-fast-18-wip/index.md b/content/publication/kufeldt-fast-18-wip/index.md
new file mode 100644
index 00000000000..8918f115509
--- /dev/null
+++ b/content/publication/kufeldt-fast-18-wip/index.md
@@ -0,0 +1,14 @@
+---
+title: "Eusocial Storage Devices"
+date: 2018-02-01
+publishDate: 2020-01-05T06:43:50.387236Z
+authors: ["Philip Kufeldt", "Timothy Feldman", "Christine Green", "Grant Mackey", "Carlos Maltzahn", "Shingo Tanaka"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*WiP and Poster Sessions at 16th USENIX Conference on File and Storage Technologies (FAST'18)*"
+tags: ["shortpapers", "eusocial", "embedded", "storage"]
+projects:
+- eusocial-storage
+---
+
diff --git a/content/publication/kufeldt-fast-19-poster/cite.bib b/content/publication/kufeldt-fast-19-poster/cite.bib
new file mode 100644
index 00000000000..c935505bccd
--- /dev/null
+++ b/content/publication/kufeldt-fast-19-poster/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{kufeldt:fast19poster,
+ address = {Boston, MA},
+ author = {Philip Kufeldt and Jianshen Liu and Carlos Maltzahn},
+ booktitle = {Poster Session at 17th USENIX Conference on File and Storage Technologies (FAST'19)},
+ date-added = {2019-12-26 19:07:25 -0800},
+ date-modified = {2019-12-29 16:35:40 -0800},
+ keywords = {shortpapers, reproducibility, embedded, storage, eusocial},
+ month = {February 25-28},
+ title = {MBWU (MibeeWu): Quantifying benefits of offloading data management to storage devices},
+ year = {2019}
+}
+
diff --git a/content/publication/kufeldt-fast-19-poster/index.md b/content/publication/kufeldt-fast-19-poster/index.md
new file mode 100644
index 00000000000..ebd7e73eab2
--- /dev/null
+++ b/content/publication/kufeldt-fast-19-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "MBWU (MibeeWu): Quantifying benefits of offloading data management to storage devices"
+date: 2019-02-01
+publishDate: 2020-01-05T06:43:50.390744Z
+authors: ["Philip Kufeldt", "Jianshen Liu", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at 17th USENIX Conference on File and Storage Technologies (FAST'19)*"
+tags: ["shortpapers", "reproducibility", "embedded", "storage", "eusocial"]
+---
+
diff --git a/content/publication/kufeldt-login-18/cite.bib b/content/publication/kufeldt-login-18/cite.bib
new file mode 100644
index 00000000000..eb2ddf6f6ec
--- /dev/null
+++ b/content/publication/kufeldt-login-18/cite.bib
@@ -0,0 +1,15 @@
+@article{kufeldt:login18,
+ abstract = {As storage devices get faster, data management tasks rob the host of CPU cycles and DDR bandwidth. In this article, we examine a new interface to storage devices that can leverage existing and new CPU and DRAM resources to take over data management tasks like availability, recovery, and migrations. This new interface provides a roadmap for device-to-device interactions and more powerful storage devices capable of providing in-store compute services that can dramatically improve performance. We call such storage devices ``eusocial'' because we are inspired by eusocial insects like ants, termites, and bees, which as individuals are primitive but collectively accomplish amazing things.},
+ author = {Philip Kufeldt and Carlos Maltzahn and Tim Feldman and Christine Green and Grant Mackey and Shingo Tanaka},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSy9rdWZlbGR0LWxvZ2luMTgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2t1ZmVsZHQtbG9naW4xOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a3VmZWxkdC1sb2dpbjE4LnBkZgAADgAoABMAawB1AGYAZQBsAGQAdAAtAGwAbwBnAGkAbgAxADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0sva3VmZWxkdC1sb2dpbjE4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ date-added = {2018-06-06 16:06:14 +0000},
+ date-modified = {2020-01-04 21:33:08 -0700},
+ journal = {;login: The USENIX Magazine},
+ keywords = {papers, storage, devices, networking, flash, offloading},
+ number = {2},
+ pages = {16--22},
+ title = {Eusocial Storage Devices - Offloading Data Management to Storage Devices that Can Act Collectively},
+ volume = {43},
+ year = {2018}
+}
+
diff --git a/content/publication/kufeldt-login-18/index.md b/content/publication/kufeldt-login-18/index.md
new file mode 100644
index 00000000000..c39fd7ce984
--- /dev/null
+++ b/content/publication/kufeldt-login-18/index.md
@@ -0,0 +1,14 @@
+---
+title: "Eusocial Storage Devices - Offloading Data Management to Storage Devices that Can Act Collectively"
+date: 2018-01-01
+publishDate: 2020-01-05T06:43:50.433238Z
+authors: ["Philip Kufeldt", "Carlos Maltzahn", "Tim Feldman", "Christine Green", "Grant Mackey", "Shingo Tanaka"]
+publication_types: ["2"]
+abstract: "As storage devices get faster, data management tasks rob the host of CPU cycles and DDR bandwidth. In this article, we examine a new interface to storage devices that can leverage existing and new CPU and DRAM resources to take over data management tasks like availability, recovery, and migrations. This new interface provides a roadmap for device-to-device interactions and more powerful storage devices capable of providing in-store compute services that can dramatically improve performance. We call such storage devices ``eusocial'' because we are inspired by eusocial insects like ants, termites, and bees, which as individuals are primitive but collectively accomplish amazing things."
+featured: false
+publication: "*;login: The USENIX Magazine*"
+tags: ["papers", "storage", "devices", "networking", "flash", "offloading"]
+projects:
+- eusocial-storage
+---
+
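+A toy sketch of devices acting collectively (hypothetical objects and API, not the article's interface): after a device fails, the surviving devices re-replicate under-replicated objects among themselves, device-to-device, without consuming host CPU or DDR bandwidth:
+
+```python
+# Toy eusocial recovery: surviving devices detect under-replication and
+# copy objects peer-to-peer until the replication target is met again.
+class Device:
+    def __init__(self, name):
+        self.name, self.objects, self.alive = name, {}, True
+
+def heal(devices, replicas=2):
+    live = [d for d in devices if d.alive]
+    for key in {k for d in live for k in d.objects}:
+        holders = [d for d in live if key in d.objects]
+        for target in live:
+            if len(holders) >= replicas:
+                break
+            if key not in target.objects:
+                target.objects[key] = holders[0].objects[key]  # device-to-device copy
+                holders.append(target)
+
+a, b, c = Device("a"), Device("b"), Device("c")
+a.objects["x"] = b"payload"
+b.objects["x"] = b"payload"
+b.alive = False          # device b fails
+heal([a, b, c])
+print("x" in c.objects)  # True: c now holds the second replica
+```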
diff --git a/content/publication/lefevre-login-20/cite.bib b/content/publication/lefevre-login-20/cite.bib
new file mode 100644
index 00000000000..263622f0836
--- /dev/null
+++ b/content/publication/lefevre-login-20/cite.bib
@@ -0,0 +1,13 @@
+@article{lefevre:login20,
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBILi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvTC9sZWZldnJlLWxvZ2luMjAucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2xlZmV2cmUtbG9naW4yMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAD/////AAAKAGN1AAAAAAAAAAAAAAAAAAFMAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkw6bGVmZXZyZS1sb2dpbjIwLnBkZgAADgAoABMAbABlAGYAZQB2AHIAZQAtAGwAbwBnAGkAbgAyADAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1sb2dpbjIwLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABvAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAd8=},
+ date-added = {2020-06-12 18:36:51 -0700},
+ date-modified = {2020-06-12 18:36:51 -0700},
+ journal = {USENIX ;login:},
+ keywords = {papers, programmable, storage, ceph, physicaldesign},
+ number = {2},
+ title = {SkyhookDM: Data Processing in Ceph with Programmable Storage},
+ volume = {45},
+ year = {2020}
+}
+
diff --git a/content/publication/lefevre-login-20/index.md b/content/publication/lefevre-login-20/index.md
new file mode 100644
index 00000000000..bdbd45afadb
--- /dev/null
+++ b/content/publication/lefevre-login-20/index.md
@@ -0,0 +1,15 @@
+---
+title: "SkyhookDM: Data Processing in Ceph with Programmable Storage"
+date: 2020-06-12
+publishDate: 2020-06-13T01:39:15.018368Z
+authors: ["Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["2"]
+abstract: ""
+featured: false
+publication: "*USENIX ;login:*"
+tags: ["papers", "programmable", "storage", "ceph", "physicaldesign"]
+projects:
+- programmable-storage
+- declstore
+- skyhook
+---
diff --git a/content/publication/lefevre-snia-20/cite.bib b/content/publication/lefevre-snia-20/cite.bib
new file mode 100644
index 00000000000..ba60fdd0c43
--- /dev/null
+++ b/content/publication/lefevre-snia-20/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{lefevre:snia20,
+ address = {Virtual},
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1zbmlhMjAucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA368jeEJEAAH/////EmxlZmV2cmUtc25pYTIwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////f5OqIAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFMAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkw6bGVmZXZyZS1zbmlhMjAucGRmAAAOACYAEgBsAGUAZgBlAHYAcgBlAC0AcwBuAGkAYQAyADAALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9ML2xlZmV2cmUtc25pYTIwLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1zbmlhMjAtc2xpZGVzLnBkZk8RAYQAAAAAAYQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAN+vI3hCRAAB/////xlsZWZldnJlLXNuaWEyMC1zbGlkZXMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TrUwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTAAAAgA+LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpMOmxlZmV2cmUtc25pYTIwLXNsaWRlcy5wZGYADgA0ABkAbABlAGYAZQB2AHIAZQAtAHMAbgBpAGEAMgAwAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADxVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9ML2xlZmV2cmUtc25pYTIwLXNsaWRlcy5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHj},
+ booktitle = {SNIA SDC 2020},
+ date-added = {2023-01-11 22:37:16 -0800},
+ date-modified = {2023-01-11 22:40:46 -0800},
+ keywords = {programmable, storage},
+ month = {September 23},
+ title = {SkyhookDM: Storage and Management of Tabular Data in Ceph},
+ year = {2020}
+}
+
diff --git a/content/publication/lefevre-snia-20/index.md b/content/publication/lefevre-snia-20/index.md
new file mode 100644
index 00000000000..7b84ed2980a
--- /dev/null
+++ b/content/publication/lefevre-snia-20/index.md
@@ -0,0 +1,12 @@
+---
+title: "SkyhookDM: Storage and Management of Tabular Data in Ceph"
+date: 2020-09-01
+publishDate: 2023-01-26T14:23:16.862545Z
+authors: ["Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*SNIA SDC 2020*"
+tags: ["programmable", "storage"]
+---
+
diff --git a/content/publication/lefevre-vault-19/cite.bib b/content/publication/lefevre-vault-19/cite.bib
new file mode 100644
index 00000000000..cb4135f4e08
--- /dev/null
+++ b/content/publication/lefevre-vault-19/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{lefevre:vault19,
+ address = {Boston, MA},
+ author = {Jeff LeFevre and Noah Watkins and Michael Sevilla and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAcTC9sZWZldnJlLXZhdWx0MTktc2xpZGVzLnBkZk8RAYgAAAAAAYgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xpsZWZldnJlLXZhdWx0MTktc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgBCLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxlZmV2cmUtdmF1bHQxOS1zbGlkZXMucGRmAA4ANgAaAGwAZQBmAGUAdgByAGUALQB2AGEAdQBsAHQAMQA5AC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASAC0vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS12YXVsdDE5LXNsaWRlcy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEMAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==},
+ booktitle = {2019 Linux Storage and Filesystems (Vault'19, co-located with FAST'19)},
+ date-added = {2019-08-07 17:58:01 -0700},
+ date-modified = {2019-08-07 18:00:48 -0700},
+ keywords = {papers, programmable, storage, database},
+ month = {February 25-26},
+ title = {Skyhook: Programmable storage for databases},
+ year = {2019}
+}
+
diff --git a/content/publication/lefevre-vault-19/index.md b/content/publication/lefevre-vault-19/index.md
new file mode 100644
index 00000000000..a8c2ad1fa71
--- /dev/null
+++ b/content/publication/lefevre-vault-19/index.md
@@ -0,0 +1,24 @@
+---
+title: "Skyhook: Programmable storage for databases"
+date: 2019-02-01
+publishDate: 2020-01-05T06:43:50.416084Z
+authors: ["Jeff LeFevre", "Noah Watkins", "Michael Sevilla", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Ceph is an open source distributed storage system that is object-based and massively scalable. Ceph provides developers with the capability to create data interfaces that can take advantage of local CPU and memory on the storage nodes (Ceph Object Storage Devices). These interfaces are powerful for application developers and can be created in C, C++, and Lua.
+
+Skyhook is an open source storage and database project in the Center for Research in Open Source Software at UC Santa Cruz. Skyhook uses these capabilities in Ceph to create specialized read/write interfaces that leverage IO and CPU within the storage layer toward database processing and management. Specifically, we develop methods to apply predicates locally as well as additional metadata and indexing capabilities using Ceph's internal indexing mechanism built on top of RocksDB.
+
+Skyhook's approach helps to enable scale-out of a single node database system by scaling out the storage layer. Our results show the performance benefits for some queries indeed scale well as the storage layer scales out."
+featured: false
+publication: "*2019 Linux Storage and Filesystems (Vault'19, co-located with FAST'19)*"
+tags: ["papers", "programmable", "storage", "database"]
+projects:
+- programmable-storage
+- declstore
+- eusocial-storage
+- skyhook
+links:
+- name: Abstract
+ url: https://www.usenix.org/conference/vault19/presentation/lefevre
+url_video: https://www.youtube.com/watch?v=D8ByGa1-_E8
+---
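
Skyhook's central mechanism, applying predicates inside the storage layer so that only matching rows cross the network, can be sketched in a few lines. This is an illustration of the concept only: the row format and function names are invented, and the real implementation runs as Ceph object classes (written in C, C++, or Lua against RADOS), not Python.

```python
# Concept sketch of predicate pushdown (assumed model, not Skyhook's API):
# the storage node applies a predicate locally and ships back only matches,
# instead of returning whole objects for the client to filter.

def storage_side_scan(rows, predicate):
    """Runs where the data lives; returns only qualifying rows."""
    return [r for r in rows if predicate(r)]

def client_query(storage_objects, predicate):
    # Each object is scanned in place; only matches cross the network.
    result = []
    for rows in storage_objects:
        result.extend(storage_side_scan(rows, predicate))
    return result

objects = [
    [{"id": 1, "temp": 20}, {"id": 2, "temp": 35}],
    [{"id": 3, "temp": 40}, {"id": 4, "temp": 15}],
]
print(client_query(objects, lambda r: r["temp"] > 30))
# [{'id': 2, 'temp': 35}, {'id': 3, 'temp': 40}]
```
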
diff --git a/content/publication/lefevre-vault-20/cite.bib b/content/publication/lefevre-vault-20/cite.bib
new file mode 100644
index 00000000000..eaf92e91794
--- /dev/null
+++ b/content/publication/lefevre-vault-20/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{lefevre:vault20,
+ address = {Santa Clara, CA},
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ booktitle = {2020 Linux Storage and Filesystems Conference (Vault'20, co-located with FAST'20 and NSDI'20)},
+ date-added = {2019-12-26 19:04:52 -0800},
+ date-modified = {2019-12-29 16:36:00 -0800},
+ keywords = {shortpapers, programmable, storage, physicaldesign},
+ month = {February 24-25},
+ title = {Scaling databases and file APIs with programmable Ceph object storage},
+ year = {2020}
+}
+
diff --git a/content/publication/lefevre-vault-20/index.md b/content/publication/lefevre-vault-20/index.md
new file mode 100644
index 00000000000..5c5cd122a89
--- /dev/null
+++ b/content/publication/lefevre-vault-20/index.md
@@ -0,0 +1,17 @@
+---
+title: "Scaling databases and file APIs with programmable Ceph object storage"
+date: 2020-02-01
+publishDate: 2020-01-05T06:43:50.391599Z
+authors: ["Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+url_slides: https://drive.google.com/file/d/1Oh60aay2TQPzrzxVu_eL5REmAtxoseVk/view?usp=sharing
+publication: "*2020 Linux Storage and Filesystems Conference (Vault'20, co-located with FAST'20 and NSDI'20)*"
+tags: ["shortpapers", "programmable", "storage", "physicaldesign"]
+projects:
+- programmable-storage
+- declstore
+- eusocial-storage
+- skyhook
+---
diff --git a/content/publication/leung-msst-07/cite.bib b/content/publication/leung-msst-07/cite.bib
new file mode 100644
index 00000000000..8ff1f62293d
--- /dev/null
+++ b/content/publication/leung-msst-07/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{leung:msst07,
+ abstract = {Achieving performance, reliability, and scalability presents a unique set of challenges for large distributed storage. To identify problem areas, there must be a way for developers to have a comprehensive view of the entire storage system. That is, users must be able to understand both node-specific behavior and complex relationships between nodes. We present a distributed file system profiling method that supports such analysis. Our approach is based on combining node-specific metrics into a single cohesive system image. This affords users two views of the storage system: a micro, per-node view, as well as a macro, multi-node view, allowing both node-specific and complex inter-nodal problems to be debugged. We visualize the storage system by displaying nodes and intuitively animating their metrics and behavior, allowing easy analysis of complex problems.},
+ address = {Santa Clara, CA},
+ author = {Andrew Leung and Eric Lalonde and Jacob Telleen and James Davis and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASTC9sZXVuZy1tc3N0MDcucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGxldW5nLW1zc3QwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFMAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkw6bGV1bmctbXNzdDA3LnBkZgAOACIAEABsAGUAdQBuAGcALQBtAHMAcwB0ADAANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvTC9sZXVuZy1tc3N0MDcucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {Proceedings of the 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007)},
+ date-added = {2019-12-26 18:07:11 -0800},
+ date-modified = {2020-01-04 21:16:58 -0700},
+ keywords = {papers, performance, debugging, distributed, storage, systems},
+ month = {September},
+ title = {Using Comprehensive Analysis for Performance Debugging in Distributed Storage Systems},
+ year = {2007}
+}
+
diff --git a/content/publication/leung-msst-07/index.md b/content/publication/leung-msst-07/index.md
new file mode 100644
index 00000000000..3aa668e20e0
--- /dev/null
+++ b/content/publication/leung-msst-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "Using Comprehensive Analysis for Performance Debugging in Distributed Storage Systems"
+date: 2007-09-01
+publishDate: 2020-01-05T06:43:50.399107Z
+authors: ["Andrew Leung", "Eric Lalonde", "Jacob Telleen", "James Davis", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Achieving performance, reliability, and scalability presents a unique set of challenges for large distributed storage. To identify problem areas, there must be a way for developers to have a comprehensive view of the entire storage system. That is, users must be able to understand both node specific behavior and complex relationships between nodes. We present a distributed file system profiling method that supports such analysis. Our approach is based on combining node-specific metrics into a single cohesive system image. This affords users two views of the storage system: a micro, per-node view, as well as, a macro, multi- node view, allowing both node-specific and complex inter- nodal problems to be debugged. We visualize the storage system by displaying nodes and intuitively animating their metrics and behavior allowing easy analysis of complex problems."
+featured: false
+publication: "*Proceedings of the 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007)*"
+tags: ["papers", "performance", "debuggung", "distributed", "storage", "systems"]
+---
+
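
The micro/macro idea in this abstract, one cohesive system image queried at either cluster or node granularity, invites a small sketch. The metric names and the median-based outlier rule below are invented for illustration; the paper's animated visualization is far richer.

```python
# Hedged sketch of micro/macro views over merged per-node metrics
# (metric names and thresholds are illustrative, not from the paper).
from statistics import mean, median

node_metrics = {
    "node1": {"iops": 850, "latency_ms": 4.2},
    "node2": {"iops": 120, "latency_ms": 31.0},  # a likely straggler
    "node3": {"iops": 800, "latency_ms": 4.9},
}

def macro_view(metrics, key):
    """Macro view: cluster-wide summary of one metric."""
    vals = [m[key] for m in metrics.values()]
    return {"min": min(vals), "mean": mean(vals), "max": max(vals)}

def micro_outliers(metrics, key, factor=3.0):
    """Micro view: nodes whose metric exceeds `factor`x the cluster median."""
    med = median(m[key] for m in metrics.values())
    return [n for n, m in metrics.items() if m[key] > factor * med]

print(macro_view(node_metrics, "latency_ms"))
print(micro_outliers(node_metrics, "latency_ms"))  # ['node2']
```
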
diff --git a/content/publication/lieggi-rhrq-22/cite.bib b/content/publication/lieggi-rhrq-22/cite.bib
new file mode 100644
index 00000000000..3bd1b7bad3b
--- /dev/null
+++ b/content/publication/lieggi-rhrq-22/cite.bib
@@ -0,0 +1,15 @@
+@article{lieggi:rhrq22,
+ author = {Stephanie Lieggi},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJocnEyMi5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8RbGllZ2dpLXJocnEyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaWVnZ2ktcmhycTIyLnBkZgAOACQAEQBsAGkAZQBnAGcAaQAtAHIAaAByAHEAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTC9saWVnZ2ktcmhycTIyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ date-added = {2022-05-10 16:11:16 -0700},
+ date-modified = {2022-05-10 16:11:48 -0700},
+ journal = {Red Hat Research Quarterly},
+ keywords = {oss, ospo, academia},
+ month = {February},
+ number = {4},
+ pages = {5--6},
+ title = {Building a university OSPO: Bolstering academic research through open source},
+ volume = {3},
+ year = {2022}
+}
+
diff --git a/content/publication/lieggi-rhrq-22/index.md b/content/publication/lieggi-rhrq-22/index.md
new file mode 100644
index 00000000000..8bc00f32c8f
--- /dev/null
+++ b/content/publication/lieggi-rhrq-22/index.md
@@ -0,0 +1,40 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: 'Building a university OSPO: Bolstering academic research through open source'
+subtitle: ''
+summary: ''
+authors:
+- Stephanie Lieggi
+tags:
+- oss
+- ospo
+- academia
+categories: []
+date: '2022-02-01'
+lastmod: 2022-05-10T16:16:59-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- cross
+- ospo
+publishDate: '2022-05-10T23:16:58.803719Z'
+publication_types:
+- '2'
+abstract: ''
+publication: '*Red Hat Research Quarterly*'
+---
diff --git a/content/publication/lieggi-rse-hpc-20/cite.bib b/content/publication/lieggi-rse-hpc-20/cite.bib
new file mode 100644
index 00000000000..e58075810f7
--- /dev/null
+++ b/content/publication/lieggi-rse-hpc-20/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{lieggi:rse-hpc20,
+ author = {Stephanie Lieggi and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBJLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvTC9saWVnZ2ktcnNlLWhwYzIwLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRsaWVnZ2ktcnNlLWhwYzIwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAA/////wAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpZWdnaS1yc2UtaHBjMjAucGRmAA4AKgAUAGwAaQBlAGcAZwBpAC0AcgBzAGUALQBoAHAAYwAyADAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJzZS1ocGMyMC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAHAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB5A==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxBQLi4vLi4vLi4vLi4vLi4vVm9sdW1lcy9Hb29nbGVEcml2ZS9NeSBEcml2ZS9QYXBlcnMvTC9saWVnZ2ktcnNlLWhwYzIwLXNsaWRlcy5wZGZPEQGMAAAAAAGMAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8bbGllZ2dpLXJzZS1ocGMyMC1zbGlkZXMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAP////8AAAoAY3UAAAAAAAAAAAAAAAAAAUwAAAIAQy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TDpsaWVnZ2ktcnNlLWhwYzIwLXNsaWRlcy5wZGYAAA4AOAAbAGwAaQBlAGcAZwBpAC0AcgBzAGUALQBoAHAAYwAyADAALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALi9NeSBEcml2ZS9QYXBlcnMvTC9saWVnZ2ktcnNlLWhwYzIwLXNsaWRlcy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAdwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAIH},
+ booktitle = {RSE-HPC -- Introduction: Research Software Engineers in HPC: Creating Community, Building Careers, Addressing Challenges, co-located with SC20},
+ date-added = {2020-11-30 12:29:24 -0800},
+ date-modified = {2020-11-30 12:31:45 -0800},
+ keywords = {papers, softwareengineering, oss, cross},
+ month = {November 12},
+ title = {The CROSS Incubator: A Case Study for funding and training RSEs},
+ year = {2020}
+}
+
diff --git a/content/publication/lieggi-rse-hpc-20/index.md b/content/publication/lieggi-rse-hpc-20/index.md
new file mode 100644
index 00000000000..02e8c49fad1
--- /dev/null
+++ b/content/publication/lieggi-rse-hpc-20/index.md
@@ -0,0 +1,14 @@
+---
+title: "The CROSS Incubator: A Case Study for funding and training RSEs"
+date: 2020-11-01
+publishDate: 2020-12-09T04:47:36.218163Z
+authors: ["Stephanie Lieggi", "Ivo Jimenez", "Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["1"]
+url_slides: https://drive.google.com/file/d/1F9Fa8kBgyMfDoT3jUxJPcDO2iV9eXS1b/view?usp=sharing
+abstract: "The incubator and research projects sponsored by the Center for Research in Open Source Software (CROSS, cross.ucsc.edu) at UC Santa Cruz have been very effective at promoting the professional and technical development of research software engineers. Carlos Maltzahn founded CROSS in 2015 with a generous gift of $2,000,000 from UC Santa Cruz alumnus Dr. Sage Weil [6] and founding memberships of Toshiba America Electronic Components, SK Hynix Memory Solutions, and Micron Technology. Over the past five years, CROSS funding has enabled PhD students to not only create research software projects but also learn how to draw in new contributors and leverage established open source software communities. This position paper will present CROSS fellowships as case studies for how university-led open source projects can create a real-world, reproducible model for effectively training, funding and supporting research software engineers."
+featured: false
+publication: "*RSE-HPC -- Introduction: Research Software Engineers in HPC: Creating Community, Building Careers, Addressing Challenges, co-located with SC20*"
+tags: ["papers", "softwareengineering", "oss", "cross"]
+projects:
+ - cross
+---
diff --git a/content/publication/liu-arxiv-21/cite.bib b/content/publication/liu-arxiv-21/cite.bib
new file mode 100644
index 00000000000..7badfa71c73
--- /dev/null
+++ b/content/publication/liu-arxiv-21/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{liu:arxiv21,
+ author = {Jianshen Liu and Carlos Maltzahn and Craig Ulmer and Matthew Leon Curry},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAqLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWFyeGl2MjEucGRmTxEBXAAAAAABXAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////D2xpdS1hcnhpdjIxLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFMAAACADQvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkw6bGl1LWFyeGl2MjEucGRmAA4AIAAPAGwAaQB1AC0AYQByAHgAaQB2ADIAMQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWFyeGl2MjEucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFEAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2021-07-23 11:37:49 -0700},
+ date-modified = {2021-07-23 11:41:43 -0700},
+ keywords = {papers, smartnics, performance},
+ month = {May 14},
+ note = {arxiv.org/abs/2105.06619 [cs.NI]},
+ title = {Performance Characteristics of the BlueField-2 SmartNIC},
+ year = {2021}
+}
+
diff --git a/content/publication/liu-arxiv-21/index.md b/content/publication/liu-arxiv-21/index.md
new file mode 100644
index 00000000000..756f821e7dd
--- /dev/null
+++ b/content/publication/liu-arxiv-21/index.md
@@ -0,0 +1,20 @@
+---
+title: "Performance Characteristics of the BlueField-2 SmartNIC"
+date: 2021-05-01
+publishDate: 2021-07-23T18:52:38.470006Z
+authors: ["Jianshen Liu", "Carlos Maltzahn", "Craig Ulmer", "Matthew Leon Curry"]
+publication_types: ["3"]
+abstract: "High-performance computing (HPC) researchers have long envisioned scenarios where application workflows could be improved through the use of programmable processing elements embedded in the network fabric. Recently, vendors have introduced programmable Smart Network Interface Cards (SmartNICs) that enable computations to be offloaded to the edge of the network. There is great interest in both the HPC and high-performance data analytics communities in understanding the roles these devices may play in the data paths of upcoming systems.
+
+This paper focuses on characterizing both the networking and computing aspects of NVIDIA's new BlueField-2 SmartNIC when used in an Ethernet environment. For the networking evaluation we conducted multiple transfer experiments between processors located at the host, the SmartNIC, and a remote host. These tests illuminate how much processing headroom is available on the SmartNIC during transfers. For the computing evaluation we used the stress-ng benchmark to compare the BlueField-2 to other servers and place realistic bounds on the types of offload operations that are appropriate for the hardware.
+
+Our findings from this work indicate that while the BlueField-2 provides a flexible means of processing data at the network's edge, great care must be taken to not overwhelm the hardware. While the host can easily saturate the network link, the SmartNIC's embedded processors may not have enough computing resources to sustain more than half the expected bandwidth when using kernel-space packet processing. From a computational perspective, encryption operations, memory operations under contention, and on-card IPC operations on the SmartNIC perform significantly better than the general-purpose servers used for comparisons in our experiments. Therefore, applications that mainly focus on these operations may be good candidates for offloading to the SmartNIC."
+featured: false
+publication: "arXiv:2105.06619 [cs.NI]"
+tags: ["papers", "smartnics", "performance"]
+projects:
+- programmable-storage
+- eusocial-storage
+- declstore
+- smartnic
+---
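
A rough way to apply the paper's headline finding, that the BlueField-2's embedded cores may sustain less than half the link bandwidth with kernel-space packet processing, is a sanity-check helper like the following. All rates here are assumed placeholders, not the paper's measurements.

```python
# Back-of-envelope helper motivated by the paper's finding: an offload only
# helps if the SmartNIC can keep up with the data rate it must absorb.
# The rates below are illustrative assumptions, not measurements.

def sustained_fraction(link_gbps, on_card_gbps):
    """Fraction of the link the on-card processing path can sustain."""
    return min(on_card_gbps / link_gbps, 1.0)

link = 100.0        # BlueField-2 Ethernet link, Gbit/s
kernel_path = 40.0  # assumed on-card kernel-space processing rate, Gbit/s
print(f"kernel-space path sustains "
      f"{sustained_fraction(link, kernel_path):.0%} of the link")
```
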
diff --git a/content/publication/liu-hotedge-20/cite.bib b/content/publication/liu-hotedge-20/cite.bib
new file mode 100644
index 00000000000..20d2d7357b8
--- /dev/null
+++ b/content/publication/liu-hotedge-20/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{liu:hotedge20,
+ address = {Boston, MA},
+ author = {Jianshen Liu and Matthew Leon Curry and Carlos Maltzahn and Philip Kufeldt},
+ booktitle = {HotEdge'20},
+ date-added = {2020-04-19 12:38:42 -0700},
+ date-modified = {2020-04-19 12:38:42 -0700},
+ keywords = {papers, edge, reliability, disaggregation, embedded, failures},
+ month = {July 14},
+ title = {Scale-out Edge Storage Systems with Embedded Storage Nodes to Get Better Availability and Cost-Efficiency At the Same Time},
+ year = {2020}
+}
+
diff --git a/content/publication/liu-hotedge-20/index.md b/content/publication/liu-hotedge-20/index.md
new file mode 100644
index 00000000000..2c4cf4460f6
--- /dev/null
+++ b/content/publication/liu-hotedge-20/index.md
@@ -0,0 +1,14 @@
+---
+title: "Scale-out Edge Storage Systems with Embedded Storage Nodes to Get Better Availability and Cost-Efficiency At the Same Time"
+date: 2020-07-01
+publishDate: 2020-04-19T19:42:33.148866Z
+authors: ["Jianshen Liu", "Matthew Leon Curry", "Carlos Maltzahn", "Philip Kufeldt"]
+publication_types: ["1"]
+url_slides: https://www.usenix.org/sites/default/files/conference/protected-files/hotedge20-paper163-slides-liu-jianshen.pdf
+abstract: "In the resource-rich environment of data centers most failures can quickly failover to redundant resources. In contrast, failure in edge infrastructures with limited resources might require maintenance personnel to drive to the location in order to fix the problem. The operational cost of these 'truck rolls' to locations at the edge infrastructure competes with the operational cost incurred by extra space and power needed for redundant resources at the edge. Computational storage devices with network interfaces can act as network-attached storage servers and offer a new design point for storage systems at the edge. In this paper we hypothesize that a system consisting of a larger number of such small 'embedded' storage nodes provides higher availability due to a larger number of failure domains while also saving operational cost in terms of space and power. As evidence for our hypothesis, we compared the possibility of data loss between two different types of storage systems: one is constructed with general-purpose servers, and the other one is constructed with embedded storage nodes. Our results show that the storage system constructed with general-purpose servers has 7 to 20 times higher risk of losing data over the storage system constructed with embedded storage devices. We also compare the two alternatives in terms of power and space using the Media-Based Working Unit (MBWU) that we developed in an earlier paper as a reference point."
+featured: false
+publication: "*HotEdge'20*"
+tags: ["papers", "edge", "reliability", "disaggregation", "embedded", "failures"]
+projects:
+- eusocial-storage
+---
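
The availability argument can be made concrete with a little replica arithmetic. The failure probabilities below are invented placeholders (the paper derives its 7x-to-20x range from its own parameters); the point is only that coupling many drives to one server chassis raises per-replica failure probability, which is then compounded across replicas.

```python
# Hedged sketch of the failure-domain argument with invented rates: a replica
# on a general-purpose server dies if its drive OR its server dies, while an
# embedded storage node is its own failure domain.

def p_all_replicas_lost(per_replica_fail, replicas=3):
    """All replicas lost, assuming independent failures on distinct domains."""
    return per_replica_fail ** replicas

p_drive, p_server = 0.01, 0.02  # assumed per-period failure probabilities
server_based = p_all_replicas_lost(1 - (1 - p_server) * (1 - p_drive))
embedded = p_all_replicas_lost(p_drive)
print(f"server-based: {server_based:.2e}")
print(f"embedded    : {embedded:.2e}  ({server_based / embedded:.0f}x lower risk)")
```
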
diff --git a/content/publication/liu-hpec-22/cite.bib b/content/publication/liu-hpec-22/cite.bib
new file mode 100644
index 00000000000..b045b85de13
--- /dev/null
+++ b/content/publication/liu-hpec-22/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{liu:hpec22,
+ address = {Virtual Event},
+ author = {Jianshen Liu and Carlos Maltzahn and Matthew L. Curry and Craig Ulmer},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWhwZWMyMi5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8ObGl1LWhwZWMyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaXUtaHBlYzIyLnBkZgAADgAeAA4AbABpAHUALQBoAHAAZQBjADIAMgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWhwZWMyMi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==},
+ booktitle = {2022 IEEE High Performance Extreme Computing Conference (IEEE HPEC 2022)},
+ date-added = {2022-08-16 17:08:46 -0700},
+ date-modified = {2022-08-16 17:22:38 -0700},
+ keywords = {smartnics, offloading, datamanagement, hpc},
+ month = {September 19-23},
+ title = {Processing Particle Data Flows with SmartNICs},
+ year = {2022}
+}
+
diff --git a/content/publication/liu-hpec-22/index.md b/content/publication/liu-hpec-22/index.md
new file mode 100644
index 00000000000..82bba30c7dd
--- /dev/null
+++ b/content/publication/liu-hpec-22/index.md
@@ -0,0 +1,45 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: Processing Particle Data Flows with SmartNICs (Outstanding Student Paper)
+subtitle: ''
+summary: ''
+authors:
+- Jianshen Liu
+- Carlos Maltzahn
+- Matthew L. Curry
+- Craig Ulmer
+tags:
+- smartnics
+- offloading
+- datamanagement
+- hpc
+categories: []
+date: '2022-09-01'
+lastmod: 2022-08-16T18:23:05-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+ - eusocial-storage
+ - smartnic
+ - skyhook
+publishDate: '2022-08-17T01:23:04.962887Z'
+publication_types:
+- '1'
+abstract: "Many distributed applications implement complex data flows and need a flexible mechanism for routing data between producers and consumers. Recent advances in programmable network interface cards, or SmartNICs, represent an opportunity to offload data-flow tasks into the network fabric, thereby freeing the hosts to perform other work. System architects in this space face multiple questions about the best way to leverage SmartNICs as processing elements in data flows. In this paper, we advocate the use of Apache Arrow as a foundation to implement data flow tasks on SmartNICs. We report on our experience adapting a partitioning algorithm for particle data to Apache Arrow and measure the on-card processing performance for the BlueField-2 SmartNIC. Our experiments confirm that the BlueField-2's (de)compression hardware can have a significant impact on in-transit workflows where data must be unpacked, processed, and repacked."
+publication: '*2022 IEEE High Performance Extreme Computing Conference (IEEE HPEC 2022)*'
+---
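
Since the paper builds its data-flow tasks on Apache Arrow, a small PyArrow sketch shows the kind of partitioning step involved. The table schema and the equal-width binning are assumptions for illustration (the paper adapts a particle-partitioning algorithm and runs it on the BlueField-2 itself); running this locally requires `pip install pyarrow`.

```python
# Illustrative Arrow-based partitioning of particle rows into spatial bins,
# the kind of data-flow step the paper offloads to a SmartNIC.
import pyarrow as pa
import pyarrow.compute as pc

particles = pa.table({
    "id": [0, 1, 2, 3, 4, 5],
    "x":  [0.1, 0.7, 0.4, 0.9, 0.2, 0.6],
})

def partition_by_x(table, n_bins):
    """Split rows into n_bins equal-width bins over x in [0, 1)."""
    bins, width = [], 1.0 / n_bins
    for i in range(n_bins):
        lo, hi = i * width, (i + 1) * width
        mask = pc.and_(pc.greater_equal(table["x"], lo),
                       pc.less(table["x"], hi))
        bins.append(table.filter(mask))
    return bins

for i, part in enumerate(partition_by_x(particles, 2)):
    print(f"bin {i}: ids={part['id'].to_pylist()}")
```
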
diff --git a/content/publication/liu-hpec-23/cite.bib b/content/publication/liu-hpec-23/cite.bib
new file mode 100644
index 00000000000..fc4eacaefe4
--- /dev/null
+++ b/content/publication/liu-hpec-23/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{liu:hpec23,
+ abstract = {High-performance computing (HPC) systems researchers have proposed using current, programmable network interface cards (or SmartNICs) to offload data management services that would otherwise consume host processor cycles in a platform. While this work has successfully mapped data pipelines to a collection of SmartNICs, users require a flexible means of inspecting in-transit data to assess the live state of the system. In this paper, we explore SmartNIC-driven opportunistic query execution, i.e., enabling the SmartNIC to make a decision about whether to execute a query operation locally (i.e., ``offload'') or defer execution to the client (i.e., ``push-back''). Characterizations of different parts of the end-to-end query path allow the decision engine to make complexity predictions that would not be feasible by the client alone.},
+ address = {Virtual},
+ author = {Jianshen Liu and Carlos Maltzahn and Craig Ulmer},
+ booktitle = {HPEC '23},
+ date-added = {2023-08-29 19:45:03 -0700},
+ date-modified = {2023-08-29 19:56:34 -0700},
+ keywords = {papers, smartnics, querying, queryprocessing, streaming, streamprocessing, analysis},
+ month = {September 25-29},
+ title = {Opportunistic Query Execution on SmartNICs for Analyzing In-Transit Data},
+ year = {2023}
+}
+
diff --git a/content/publication/liu-hpec-23/index.md b/content/publication/liu-hpec-23/index.md
new file mode 100644
index 00000000000..1748aa58d7b
--- /dev/null
+++ b/content/publication/liu-hpec-23/index.md
@@ -0,0 +1,16 @@
+---
+title: "Opportunistic Query Execution on SmartNICs for Analyzing In-Transit Data"
+date: 2023-09-01
+publishDate: 2023-08-30T03:43:54.396498Z
+authors: ["Jianshen Liu", "Carlos Maltzahn", "Craig Ulmer"]
+publication_types: ["1"]
+abstract: "High-performance computing (HPC) systems researchers have proposed using current, programmable network interface cards (or SmartNICs) to offload data management services that would otherwise consume host processor cycles in a platform. While this work has successfully mapped data pipelines to a collection of SmartNICs, users require a flexible means of inspecting in-transit data to assess the live state of the system. In this paper, we explore SmartNIC-driven opportunistic query execution, i.e., enabling the SmartNIC to make a decision about whether to execute a query operation locally (i.e., 'offload') or defer execution to the client (i.e., 'push-back'). Characterizations of different parts of the end-to-end query path allow the decision engine to make complexity predictions that would not be feasible by the client alone."
+featured: false
+publication: "*HPEC '23*"
+tags: ["papers", "smartnics", "querying", "queryprocessing", "streaming", "streamprocessing", "analysis"]
+projects:
+ - eusocial-storage
+ - smartnic
+ - skyhook
+---
+
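
The offload-or-push-back decision can be sketched as a cost comparison. Every cost constant below is invented; in the paper, the decision engine's predictions come from characterizing the end-to-end query path rather than fixed per-row costs.

```python
# Hedged sketch of an offload/push-back decision. Costs are arbitrary units
# per row and purely illustrative, not the paper's model.

def decide(selectivity, rows, nic_cost, host_cost, net_cost):
    """Return 'offload' if filtering on the SmartNIC beats shipping raw rows."""
    offload = rows * nic_cost + selectivity * rows * net_cost
    push_back = rows * net_cost + rows * host_cost
    return "offload" if offload < push_back else "push-back"

# Highly selective filter: shipping only matches pays for slower NIC cores.
print(decide(selectivity=0.01, rows=1_000_000,
             nic_cost=2.0, host_cost=1.0, net_cost=2.0))   # offload
# Unselective filter: nearly everything crosses the wire anyway; use the host.
print(decide(selectivity=0.95, rows=1_000_000,
             nic_cost=2.0, host_cost=1.0, net_cost=2.0))   # push-back
```
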
diff --git a/content/publication/liu-iodc-19/cite.bib b/content/publication/liu-iodc-19/cite.bib
new file mode 100644
index 00000000000..1d2405d4a8e
--- /dev/null
+++ b/content/publication/liu-iodc-19/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{liu:iodc19,
+ abstract = {The storage industry is considering new kinds of storage devices that support data access function offloading, i.e., the ability to perform data access functions on the storage device itself as opposed to performing them on a separate compute system to which the storage device is connected. But what is the benefit of offloading to a storage device that is controlled by an embedded platform, very different from a host platform? To quantify the benefit, we need a measurement methodology that enables apples-to-apples comparisons between different platforms. We propose a Media-based Work Unit (MBWU, pronounced ''MibeeWu''), and an MBWU-based measurement methodology to standardize the platform efficiency evaluation so as to quantify the benefit of offloading. To demonstrate the merit of this methodology, we implemented a prototype to automate quantifying the benefit of offloading the key-value data access function.},
+ address = {Frankfurt a. M., Germany},
+ author = {Jianshen Liu and Philip Kufeldt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQTC9saXUtaW9kYzE5LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5saXUtaW9kYzE5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpdS1pb2RjMTkucGRmAA4AHgAOAGwAaQB1AC0AaQBvAGQAYwAxADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWlvZGMxOS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXTC9saXUtaW9kYzE5LXNsaWRlcy5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VbGl1LWlvZGMxOS1zbGlkZXMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUwAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TDpsaXUtaW9kYzE5LXNsaWRlcy5wZGYAAA4ALAAVAGwAaQB1AC0AaQBvAGQAYwAxADkALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvTC9saXUtaW9kYzE5LXNsaWRlcy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {HPC I/O in the Data Center Workshop (HPC-IODC 2019, co-located with ISC-HPC 2019)},
+ date-added = {2019-12-26 15:40:05 -0800},
+ date-modified = {2020-01-04 21:23:18 -0700},
+ keywords = {papers, reproducibility, performance, embedded, storage, eusocial},
+ month = {June 20},
+ title = {MBWU: Benefit Quantification for Data Access Function Offloading},
+ year = {2019}
+}
+
diff --git a/content/publication/liu-iodc-19/index.md b/content/publication/liu-iodc-19/index.md
new file mode 100644
index 00000000000..ea487f90567
--- /dev/null
+++ b/content/publication/liu-iodc-19/index.md
@@ -0,0 +1,16 @@
+---
+title: "MBWU: Benefit Quantification for Data Access Function Offloading"
+date: 2019-06-01
+publishDate: 2020-01-05T06:43:50.409943Z
+authors: ["Jianshen Liu", "Philip Kufeldt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "The storage industry is considering new kinds of storage de- vices that support data access function offloading, i.e. the ability to perform data access functions on the storage device itself as opposed to performing it on a separate compute system to which the storage device is connected. But what is the benefit of offloading to a storage device that is controlled by an embedded platform, very different from a host platform? To quantify the benefit, we need a measurement methodology that enables apple-to-apple comparisons between different platforms. We propose a Media-based Work Unit (MBWU, pronounced ''MibeeWu''), and an MBWU-based measurement methodology to standardize the platform efficiency evaluation so as to quantify the benefit of offloading. To demonstrate the merit of this methodology, we implemented a prototype to automate quantifying the benefit of offloading the key-value data access function."
+featured: false
+publication: "*HPC I/O in the Data Center Workshop (HPC-IODC 2019, co-located with ISC-HPC 2019)*"
+url_slides: "https://hps.vi4io.org/_media/events/2019/hpc-iodc-mmbwu-maltzahn.pdf"
+tags: ["papers", "reproducibility", "performance", "embedded", "storage", "eusocial"]
+projects:
+- declstore
+- eusocial-storage
+---
+
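
The MBWU idea is essentially a normalization: measure how many media-units of work a platform actually drives on a workload, then divide cost (watts, dollars, rack units) by that number so host and embedded platforms compare apples-to-apples. A hedged sketch with invented numbers:

```python
# Sketch of MBWU-style normalization; all throughput and power figures below
# are invented placeholders, not the paper's measurements.

def mbwus(platform_kops, one_unit_kops):
    """Media-units of work a platform sustains on the same workload."""
    return platform_kops / one_unit_kops

ONE_MBWU = 50.0                  # one drive's key-value throughput, kops/s
host = mbwus(400.0, ONE_MBWU)    # assumed big server: 8.0 MBWUs
card = mbwus(45.0, ONE_MBWU)     # assumed embedded platform: 0.9 MBWUs
print(f"host: {480 / host:.0f} W per MBWU")  # at an assumed 480 W
print(f"card: {15 / card:.0f} W per MBWU")   # at an assumed 15 W
```
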
diff --git a/content/publication/liu-msst-12/cite.bib b/content/publication/liu-msst-12/cite.bib
new file mode 100644
index 00000000000..8daa12dc9d5
--- /dev/null
+++ b/content/publication/liu-msst-12/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{liu:msst12,
+ abstract = {The largest-scale high-performance computing (HPC) systems are stretching parallel file systems to their limits in terms of aggregate bandwidth and numbers of clients. To further sustain the scalability of these file systems, researchers and HPC storage architects are exploring various storage system designs. One proposed storage system design integrates a tier of solid-state burst buffers into the storage system to absorb application I/O requests. In this paper, we simulate and explore this storage system design for use by large-scale HPC systems. First, we examine application I/O patterns on an existing large-scale HPC system to identify common burst patterns. Next, we describe enhancements to the CODES storage system simulator to enable our burst buffer simulations. These enhancements include the integration of a burst buffer model into the I/O forwarding layer of the simulator, the development of an I/O kernel description language and interpreter, the development of a suite of I/O kernels that are derived from observed I/O patterns, and fidelity improvements to the CODES models. We evaluate the I/O performance for a set of multi-application I/O workloads and burst buffer configurations. We show that burst buffers can accelerate the application-perceived throughput to the external storage system and can reduce the amount of external storage bandwidth required to meet a desired application-perceived throughput goal.},
+ address = {Pacific Grove, CA},
+ author = {Ning Liu and Jason Cope and Philip Carns and Christopher Carothers and Robert Ross and Gary Grider and Adam Crume and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQTC9saXUtbXNzdDEyLnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5saXUtbXNzdDEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpdS1tc3N0MTIucGRmAA4AHgAOAGwAaQB1AC0AbQBzAHMAdAAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LW1zc3QxMi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ bdsk-url-1 = {http://www.mcs.anl.gov/uploads/cels/papers/P2070-0312.pdf},
+ booktitle = {MSST/SNAPI 2012},
+ date-added = {2012-03-14 14:37:23 +0000},
+ date-modified = {2020-01-05 05:31:12 -0700},
+ keywords = {papers, burstbuffer, simulation, hpc, distributed},
+ month = {April 16 - 20},
+ title = {On the Role of Burst Buffers in Leadership-class Storage Systems},
+ year = {2012}
+}
+
diff --git a/content/publication/liu-msst-12/index.md b/content/publication/liu-msst-12/index.md
new file mode 100644
index 00000000000..03e58c42398
--- /dev/null
+++ b/content/publication/liu-msst-12/index.md
@@ -0,0 +1,14 @@
+---
+title: "On the Role of Burst Buffers in Leadership-class Storage Systems"
+date: 2012-04-01
+publishDate: 2020-01-05T13:33:05.974919Z
+authors: ["Ning Liu", "Jason Cope", "Philip Carns", "Christopher Carothers", "Robert Ross", "Gary Grider", "Adam Crume", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "The largest-scale high-performance (HPC) systems are stretching parallel file systems to their limits in terms of aggregate bandwidth and numbers of clients. To further sustain the scalability of these file systems, researchers and HPC storage architects are exploring various storage system designs. One proposed storage system design integrates a tier of solid-state burst buffers into the storage system to absorb application I/O requests. In this paper, we simulate and explore this storage system design for use by large-scale HPC systems. First, we examine application I/O patterns on an existing large-scale HPC system to identify common burst patterns. Next, we describe enhancements to the CODES storage system simulator to enable our burst buffer simulations. These enhancements include the integration of a burst buffer model into the I/O forwarding layer of the simulator, the development of an I/O kernel description language and interpreter, the development of a suite of I/O kernels that are derived from observed I/O patterns, and fidelity improvements to the CODES models. We evaluate the I/O performance for a set of multiapplication I/O workloads and burst buffer configurations. We show that burst buffers can accelerate the application perceived throughput to the external storage system and can reduce the amount of external storage bandwidth required to meet a desired application perceived throughput goal."
+featured: false
+publication: "*MSST/SNAPI 2012*"
+tags: ["papers", "burstbuffer", "simulation", "hpc", "distributed"]
+projects:
+- storage-simulation
+---
+
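
The burst-buffer effect the paper simulates (with CODES, at far higher fidelity) can be caricatured in a few lines: a buffer absorbs application write bursts at full speed and drains to external storage at a much lower sustained rate. All capacities and rates below are invented.

```python
# Toy time-stepped burst-buffer model; parameters are illustrative only.

def simulate(bursts, buffer_gb, drain_gb_per_step):
    """bursts: GB the app writes at each step. Returns (peak, spilled)."""
    occupancy = peak = spilled = 0.0
    for burst in bursts:
        occupancy += burst
        if occupancy > buffer_gb:            # burst exceeds buffer capacity
            spilled += occupancy - buffer_gb
            occupancy = buffer_gb
        occupancy = max(0.0, occupancy - drain_gb_per_step)
        peak = max(peak, occupancy)
    return peak, spilled

# 5 GB burst every 10 steps, otherwise idle; buffer drains 1 GB per step.
pattern = ([5.0] + [0.0] * 9) * 5
for cap in (8.0, 4.0):
    peak, spilled = simulate(pattern, buffer_gb=cap, drain_gb_per_step=1.0)
    print(f"{cap:.0f} GB buffer: peak {peak:.1f} GB, spilled {spilled:.1f} GB")
```

An undersized buffer spills (stalling the application on external bandwidth), which is the sizing trade-off the CODES simulations explore.
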
diff --git a/content/publication/liu-ocpgs-19/cite.bib b/content/publication/liu-ocpgs-19/cite.bib
new file mode 100644
index 00000000000..81e78f5981a
--- /dev/null
+++ b/content/publication/liu-ocpgs-19/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{liu:ocpgs19,
+ author = {Jianshen Liu and Philip Kufeldt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYTC9saXUtb2NwZ3MxOS1wb3N0ZXIucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FmxpdS1vY3BnczE5LXBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFMAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkw6bGl1LW9jcGdzMTktcG9zdGVyLnBkZgAOAC4AFgBsAGkAdQAtAG8AYwBwAGcAcwAxADkALQBwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvTC9saXUtb2NwZ3MxOS1wb3N0ZXIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ date-added = {2019-05-06 18:39:54 -0700},
+ date-modified = {2020-01-04 21:29:52 -0700},
+ keywords = {shortpapers, eusocial, storagemedium, performance},
+ month = {March 14-15},
+ note = {Poster at OCP Global Summit 2019},
+ title = {Quantifying benefits of offloading data management to storage devices},
+ year = {2019}
+}
+
diff --git a/content/publication/liu-ocpgs-19/index.md b/content/publication/liu-ocpgs-19/index.md
new file mode 100644
index 00000000000..a075b87d06b
--- /dev/null
+++ b/content/publication/liu-ocpgs-19/index.md
@@ -0,0 +1,12 @@
+---
+title: "Quantifying benefits of offloading data management to storage devices"
+date: 2019-03-01
+publishDate: 2020-01-05T06:43:50.426132Z
+authors: ["Jianshen Liu", "Philip Kufeldt", "Carlos Maltzahn"]
+publication_types: ["3"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["shortpapers", "eusocial", "storagemedium", "performance"]
+---
+
diff --git a/content/publication/liu-ppam-11/cite.bib b/content/publication/liu-ppam-11/cite.bib
new file mode 100644
index 00000000000..5908859bc25
--- /dev/null
+++ b/content/publication/liu-ppam-11/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{liu:ppam11,
+ abstract = {Exascale supercomputers will have the potential for billion-way parallelism. While physical implementations of these systems are currently not available, HPC system designers can develop models of exascale systems to evaluate system design points. Modeling these systems and associated subsystems is a significant challenge. In this paper, we present the Co-design of Exascale Storage System (CODES) framework for evaluating exascale storage system design points. As part of our early work with CODES, we discuss the use of the CODES framework to simulate leadership-scale storage systems in a tractable amount of time using parallel discrete-event simulation. We describe the current storage system models and protocols included with the CODES framework and demonstrate the use of CODES through simulations of an existing petascale storage system.},
+ address = {Torun, Poland},
+ author = {Ning Liu and Christopher Carothers and Jason Cope and Philip Carns and Robert Ross and Adam Crume and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQTC9saXUtcHBhbTExLnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5saXUtcHBhbTExLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpdS1wcGFtMTEucGRmAA4AHgAOAGwAaQB1AC0AcABwAGEAbQAxADEALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LXBwYW0xMS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ booktitle = {PPAM 2011},
+ date-added = {2012-01-17 01:13:05 +0000},
+ date-modified = {2020-01-05 05:32:41 -0700},
+ keywords = {papers, simulation, exascale, storage, systems, parallel, filesystems, hpc},
+ month = {September 11-14},
+ title = {Modeling a Leadership-scale Storage System},
+ year = {2011}
+}
+
diff --git a/content/publication/liu-ppam-11/index.md b/content/publication/liu-ppam-11/index.md
new file mode 100644
index 00000000000..1a71a1d4fc0
--- /dev/null
+++ b/content/publication/liu-ppam-11/index.md
@@ -0,0 +1,14 @@
+---
+title: "Modeling a Leadership-scale Storage System"
+date: 2011-09-01
+publishDate: 2020-01-05T13:33:05.978979Z
+authors: ["Ning Liu", "Christopher Carothers", "Jason Cope", "Philip Carns", "Robert Ross", "Adam Crume", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Exascale supercomputers will have the potential for billion-way parallelism. While physical implementations of these systems are currently not available, HPC system designers can develop models of exascale systems to evaluate system design points. Modeling these systems and associated subsystems is a significant challenge. In this paper, we present the Co-design of Exascale Storage System (CODES) framework for evaluating exascale storage system design points. As part of our early work with CODES, we discuss the use of the CODES framework to simulate leadership-scale storage systems in a tractable amount of time using parallel discrete-event simulation. We describe the current storage system models and protocols included with the CODES framework and demonstrate the use of CODES through simulations of an existing petascale storage system."
+featured: false
+publication: "*PPAM 2011*"
+tags: ["papers", "simulation", "exascale", "storage", "systems", "parallel", "filesystems", "hpc"]
+projects:
+- storage-simulation
+---
+
diff --git a/content/publication/lofstead-cluster-14-poster/cite.bib b/content/publication/lofstead-cluster-14-poster/cite.bib
new file mode 100644
index 00000000000..eeefb6879d2
--- /dev/null
+++ b/content/publication/lofstead-cluster-14-poster/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{lofstead:cluster14poster,
+ address = {Madrid, Spain},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn and Quincey Koziol and John Bent and Eric Barton},
+ booktitle = {in Poster Session at IEEE Cluster 2014},
+ date-added = {2019-12-26 19:23:07 -0800},
+ date-modified = {2019-12-29 16:34:56 -0800},
+ keywords = {shortpapers, storage, parallel, hpc, exascale},
+ month = {September 22-26},
+ title = {An Innovative Storage Stack Addressing Extreme Scale Platforms and Big Data Applications},
+ year = {2014}
+}
+
diff --git a/content/publication/lofstead-cluster-14-poster/index.md b/content/publication/lofstead-cluster-14-poster/index.md
new file mode 100644
index 00000000000..aca0fa67bc6
--- /dev/null
+++ b/content/publication/lofstead-cluster-14-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "An Innovative Storage Stack Addressing Extreme Scale Platforms and Big Data Applications"
+date: 2014-09-01
+publishDate: 2020-01-05T06:43:50.384829Z
+authors: ["Jay Lofstead", "Ivo Jimenez", "Carlos Maltzahn", "Quincey Koziol", "John Bent", "Eric Barton"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*in Poster Session at IEEE Cluster 2014*"
+tags: ["shortpapers", "storage", "parallel", "hpc", "exascale"]
+---
+
diff --git a/content/publication/lofstead-discs-14/cite.bib b/content/publication/lofstead-discs-14/cite.bib
new file mode 100644
index 00000000000..ff45f33e65b
--- /dev/null
+++ b/content/publication/lofstead-discs-14/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{lofstead:discs14,
+ abstract = {Scientific simulations are moving away from using centralized persistent storage for intermediate data between workflow steps towards an all-online model. This shift is motivated by the relatively slow IO bandwidth growth compared with compute speed increases. The challenges presented by this shift to Integrated Application Workflows are motivated by the loss of persistent storage semantics for node-to-node communication. One step towards addressing this semantics gap is using transactions to logically delineate a data set from 100,000s of processes to 1000s of servers as an atomic unit.
+Our previously demonstrated Doubly Distributed Transactions (D2T) protocol showed a high-performance solution, but had not explored how to detect and recover from faults. Instead, the focus was on demonstrating typical-case performance. The research presented here addresses fault detection and recovery based on the enhanced protocol design. The total overhead for a full transaction with multiple operations at 65,536 processes is on average 0.055 seconds. Fault detection and recovery mechanisms demonstrate similar performance to the success case with only the addition of appropriate timeouts for the system. This paper explores the challenges in designing a recoverable protocol for doubly distributed transactions, particularly for parallel computing environments.},
+ address = {New Orleans, LA},
+ author = {Jay Lofstead and Jai Dayal and Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWTC9sb2ZzdGVhZC1kaXNjczE0LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRsb2ZzdGVhZC1kaXNjczE0LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxvZnN0ZWFkLWRpc2NzMTQucGRmAA4AKgAUAGwAbwBmAHMAdABlAGEAZAAtAGQAaQBzAGMAcwAxADQALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0wvbG9mc3RlYWQtZGlzY3MxNC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {The 2014 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2014) (Workshop co-located with Supercomputing 2014)},
+ date-added = {2019-12-26 16:14:45 -0800},
+ date-modified = {2020-01-04 21:18:57 -0700},
+ keywords = {papers, datamanagement, hpc},
+ month = {November 16},
+ title = {Efficient, Failure Resilient Transactions for Parallel and Distributed Computing},
+ year = {2014}
+}
+
diff --git a/content/publication/lofstead-discs-14/index.md b/content/publication/lofstead-discs-14/index.md
new file mode 100644
index 00000000000..bf7571b2cef
--- /dev/null
+++ b/content/publication/lofstead-discs-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Efficient, Failure Resilient Transactions for Parallel and Distributed Computing"
+date: 2014-11-01
+publishDate: 2020-01-05T06:43:50.403341Z
+authors: ["Jay Lofstead", "Jai Dayal", "Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Scientific simulations are moving away from using centralized persistent storage for intermediate data between workflow steps towards an all online model. This shift is motivated by the relatively slow IO bandwidth growth compared with compute speed increases. The challenges presented by this shift to Integrated Application Workflows are motivated by the loss of persistent storage semantics for node-to-node communication. One step towards addressing this semantics gap is using transac- tions to logically delineate a data set from 100,000s of processes to 1000s of servers as an atomic unit. Our previously demonstrated Doubly Distributed Transac- tions (D2T) protocol showed a high-performance solution, but had not explored how to detect and recover from faults. Instead, the focus was on demonstrating high-performance typical case performance. The research presented here addresses fault detec- tion and recovery based on the enhanced protocol design. The total overhead for a full transaction with multiple operations at 65,536 processes is on average 0.055 seconds. Fault detection and recovery mechanisms demonstrate similar performance to the success case with only the addition of appropriate timeouts for the system. This paper explores the challenges in designing a recoverable protocol for doubly distributed transactions, partic- ularly for parallel computing environments."
+featured: false
+publication: "*The 2014 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2014) (Workshop co-located with Supercomputing 2014)*"
+tags: ["papers", "datamanagement", "hpc"]
+---
+
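
One concrete piece of the fault-detection story is the timeout: a coordinator that waits indefinitely for a failed participant would hang, so votes are collected against a deadline and the transaction aborts on silence. The sketch below is an invented miniature of that idea, not the D2T protocol itself.

```python
# Minimal sketch of timeout-based fault detection in a commit protocol
# (invented API; D2T is far richer and runs at 100,000s of processes).
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as VoteTimeout

def vote(participant, delay_s):
    time.sleep(delay_s)  # stand-in for network latency plus local prepare work
    return "commit"

def run_transaction(delays, timeout_s=0.5):
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(vote, i, d) for i, d in enumerate(delays)]
        try:
            votes = [f.result(timeout=timeout_s) for f in futures]
        except VoteTimeout:
            # Silence past the deadline is treated as a failure: abort.
            return "abort (participant presumed failed)"
    return "commit" if all(v == "commit" for v in votes) else "abort"

print(run_transaction([0.01, 0.02, 0.01]))  # healthy run -> commit
print(run_transaction([0.01, 2.00, 0.01]))  # silent participant -> abort
# Note: the second call takes ~2 s while the pool drains the slow thread.
```
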
diff --git a/content/publication/lofstead-iasds-14/cite.bib b/content/publication/lofstead-iasds-14/cite.bib
new file mode 100644
index 00000000000..645e3e4c683
--- /dev/null
+++ b/content/publication/lofstead-iasds-14/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{lofstead:iasds14,
+ abstract = {The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase 1 of the project complete, it is an excellent opportunity to evaluate many of the decisions made to feed into the phase 2 effort. With this paper we not only provide a timely summary of important aspects of the design specifications but also capture the underlying reasoning that is not available elsewhere.
+The initial effort to define a next generation storage system has made admirable contributions in architecture and design. Formalizing the general idea of data staging into burst buffers for the storage system will help manage the performance variability and offer additional data processing opportunities outside the main compute and storage system. Adding a transactional mechanism to manage faults and data visibility helps enable effective analytics without having to work around the IO stack semantics. While these and other contributions are valuable, similar efforts made elsewhere may offer attractive alternatives or differing semantics that could yield a more feature-rich environment with little to no additional overhead. For example, the Doubly Distributed Transactions (D2T) protocol offers an alternative approach for incorporating transactional semantics into the data path. Another project, PreDatA, examined how to get the best throughput for data operators and may offer additional insights into further refinements of the Burst Buffer concept.
+This paper examines some of the choices made by the Fast Forward team and compares them with other options and offers observations and suggestions based on these other efforts. This will include some non-core contributions of other projects, such as some of the demonstration metadata and data storage components generated while implementing D2T, to make suggestions that may help the next generation design for how the IO stack works as a whole.},
+ address = {Minneapolis, MN},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWTC9sb2ZzdGVhZC1pYXNkczE0LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRsb2ZzdGVhZC1pYXNkczE0LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxvZnN0ZWFkLWlhc2RzMTQucGRmAA4AKgAUAGwAbwBmAHMAdABlAGEAZAAtAGkAYQBzAGQAcwAxADQALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0wvbG9mc3RlYWQtaWFzZHMxNC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS 2014)},
+ date-added = {2019-12-26 16:17:49 -0800},
+ date-modified = {2020-01-04 23:08:26 -0700},
+ keywords = {papers, datamanagement, hpc},
+ month = {September 9-12},
+ title = {Consistency and Fault Tolerance Considerations for the Next Iteration of the DOE Fast Forward Storage and IO Project},
+ year = {2014}
+}
+
diff --git a/content/publication/lofstead-iasds-14/index.md b/content/publication/lofstead-iasds-14/index.md
new file mode 100644
index 00000000000..780ecf20e4c
--- /dev/null
+++ b/content/publication/lofstead-iasds-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Consistency and Fault Tolerance Considerations for the Next Iteration of the DOE Fast Forward Storage and IO Project"
+date: 2014-09-01
+publishDate: 2020-01-05T06:43:50.401922Z
+authors: ["Jay Lofstead", "Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase 1 of the project complete, it is an excellent opportunity to evaluate many of the decisions made to feed into the phase 2 effort. With this paper we not only provide a timely summary of important aspects of the design specifications but also capture the underlying reasoning that is not available elsewhere. The initial effort to define a next generation storage system has made admirable contributions in architecture and design. Formalizing the general idea of data staging into burst buffers for the storage system will help manage the performance variability and offer additional data processing opportunities outside the main compute and storage system. Adding a transactional mech- anism to manage faults and data visibility helps enable effective analytics without having to work around the IO stack semantics. While these and other contributions are valuable, similar efforts made elsewhere may offer attractive alternatives or differing semantics that could yield a more feature rich environment with little to no additional overhead. For example, the Doubly Distributed Transactions (D2T) protocol offers an alternative approach for incorporating transactional semantics into the data path. Another project, PreDatA, examined how to get the best throughput for data operators and may offer additional insights into further refinements of the Burst Buffer concept. This paper examines some of the choices made by the Fast Forward team and compares them with other options and offers observations and suggestions based on these other efforts. This will include some non-core contributions of other projects, such as some of the demonstration metadata and data storage components generated while implementing D2T, to make suggestions that may help the next generation design for how the IO stack works as a whole."
+featured: false
+publication: "*Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS 2014)*"
+tags: ["papers", "datamanagement", "hpc"]
+---
+
diff --git a/content/publication/lofstead-pdsw-13/cite.bib b/content/publication/lofstead-pdsw-13/cite.bib
new file mode 100644
index 00000000000..c3aa97b80b2
--- /dev/null
+++ b/content/publication/lofstead-pdsw-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{lofstead:pdsw13,
+ abstract = {The rise of Integrated Application Workflows (IAWs) for processing data prior to storage on persistent media prompts the need to incorporate features that reproduce many of the semantics of persistent storage devices. One such feature is the ability to manage data sets as chunks with natural barriers between different data sets. Towards that end, we need a mechanism to ensure that data moved to an intermediate storage area is both complete and correct before allowing access by other processing components. The Doubly Distributed Transactions (D2T) protocol offers such a mechanism. The initial development [9] suffered from scalability limitations and undue requirements on server processes. The current version has addressed these limitations and has demonstrated scalability with low overhead.},
+ address = {Denver, CO},
+ author = {Jay Lofstead and Jai Dayal and Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVTC9sb2ZzdGVhZC1wZHN3MTMucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2xvZnN0ZWFkLXBkc3cxMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFMAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkw6bG9mc3RlYWQtcGRzdzEzLnBkZgAADgAoABMAbABvAGYAcwB0AGUAYQBkAC0AcABkAHMAdwAxADMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0wvbG9mc3RlYWQtcGRzdzEzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {8th Parallel Data Storage Workshop at Supercomputing '13 (PDSW 2013)},
+ date-added = {2019-12-26 16:21:31 -0800},
+ date-modified = {2020-01-04 21:17:41 -0700},
+ keywords = {papers, transactions, datamanagement, hpc},
+ month = {November 18},
+ title = {Efficient Transactions for Parallel Data Movement},
+ year = {2013}
+}
+
diff --git a/content/publication/lofstead-pdsw-13/index.md b/content/publication/lofstead-pdsw-13/index.md
new file mode 100644
index 00000000000..d7a5f396e1f
--- /dev/null
+++ b/content/publication/lofstead-pdsw-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "Efficient Transactions for Parallel Data Movement"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.400495Z
+authors: ["Jay Lofstead", "Jai Dayal", "Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "The rise of Integrated Application Workflows (IAWs) for processing data prior to storage on persistent media prompts the need to incorporate features that reproduce many of the semantics of persistent storage devices. One such feature is the ability to manage data sets as chunks with natural barriers between different data sets. Towards that end, we need a mechanism to ensure that data moved to an intermediate storage area is both complete and correct before allowing access by other processing components. The Dou- bly Distributed Transactions (D2T) protocol offers such a mechanism. The initial development [9] suffered from scal- ability limitations and undue requirements on server processes. The current version has addressed these limitations and has demonstrated scalability with low overhead."
+featured: false
+publication: "*8th Parallel Data Storage Workshop at Supercomputing '13 (PDSW 2013)*"
+tags: ["papers", "transactions", "datamanagement", "hpc"]
+---
+
diff --git a/content/publication/lofstead-sc-16/cite.bib b/content/publication/lofstead-sc-16/cite.bib
new file mode 100644
index 00000000000..2e3def3c080
--- /dev/null
+++ b/content/publication/lofstead-sc-16/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{lofstead:sc16,
+ abstract = {The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase two of the project starting, it is an excellent opportunity to explore the complete design and how it will address the needs of extreme scale platforms. This paper examines each layer of the proposed stack in some detail along with cross-cutting topics, such as transactions and metadata management.
+This paper not only provides a timely summary of important aspects of the design specifications but also captures the underlying reasoning that is not available elsewhere. We encourage the broader community to understand the design, intent, and future directions to foster discussion guiding phase two and the ultimate production storage stack based on this work. An initial performance evaluation of the early prototype implementation is also provided to validate the presented design.},
+ address = {Salt Lake City, UT},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn and Quincey Koziol and John Bent and Eric Barton},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATTC9sb2ZzdGVhZC1zYzE2LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFsb2ZzdGVhZC1zYzE2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxvZnN0ZWFkLXNjMTYucGRmAAAOACQAEQBsAG8AZgBzAHQAZQBhAGQALQBzAGMAMQA2AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9ML2xvZnN0ZWFkLXNjMTYucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {29th ACM and IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC16)},
+ date-added = {2019-12-26 15:58:41 -0800},
+ date-modified = {2020-01-04 21:19:51 -0700},
+ keywords = {papers, parallel, storage, hpc, exascale},
+ month = {November 13-18},
+ title = {DAOS and Friends: A Proposal for an Exascale Storage System},
+ year = {2016}
+}
+
diff --git a/content/publication/lofstead-sc-16/index.md b/content/publication/lofstead-sc-16/index.md
new file mode 100644
index 00000000000..f861da2d841
--- /dev/null
+++ b/content/publication/lofstead-sc-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "DAOS and Friends: A Proposal for an Exascale Storage System"
+date: 2016-11-01
+publishDate: 2020-01-05T06:43:50.405973Z
+authors: ["Jay Lofstead", "Ivo Jimenez", "Carlos Maltzahn", "Quincey Koziol", "John Bent", "Eric Barton"]
+publication_types: ["1"]
+abstract: "The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase two of the project starting, it is an excellent opportunity to explore the complete design and how it will address the needs of extreme scale platforms. This paper examines each layer of the proposed stack in some detail along with cross-cutting topics, such as transactions and metadata management. This paper not only provides a timely summary of important aspects of the design specifications but also captures the under- lying reasoning that is not available elsewhere. We encourage the broader community to understand the design, intent, and future directions to foster discussion guiding phase two and the ultimate production storage stack based on this work. An initial performance evaluation of the early prototype implementation is also provided to validate the presented design."
+featured: false
+publication: "*29th ACM and IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC16)*"
+tags: ["papers", "parallel", "storage", "hpc", "exascale"]
+---
+
diff --git a/content/publication/malik-precs-22/cite.bib b/content/publication/malik-precs-22/cite.bib
new file mode 100644
index 00000000000..d40ee8dcb23
--- /dev/null
+++ b/content/publication/malik-precs-22/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{malik:precs22,
+ author = {Tanu Malik and Anjo Vahldiek-Oberwagner and Ivo Jimenez and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsaWstcHJlY3MyMi5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADfryN4QkQAAf////8RbWFsaWstcHJlY3MyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9/k1PsAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAU0AAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TTptYWxpay1wcmVjczIyLnBkZgAOACQAEQBtAGEAbABpAGsALQBwAHIAZQBjAHMAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTS9tYWxpay1wcmVjczIyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ booktitle = {P-RECS'22},
+ date-added = {2023-01-11 21:05:52 -0800},
+ date-modified = {2023-01-11 21:07:18 -0800},
+ doi = {10.1145/3526062.3536354},
+ keywords = {reproducibility},
+ title = {Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21},
+ year = {2022}
+}
+
diff --git a/content/publication/malik-precs-22/index.md b/content/publication/malik-precs-22/index.md
new file mode 100644
index 00000000000..fe0a49215ef
--- /dev/null
+++ b/content/publication/malik-precs-22/index.md
@@ -0,0 +1,13 @@
+---
+title: "Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21"
+date: 2022-01-01
+publishDate: 2023-01-26T14:23:16.863252Z
+authors: ["Tanu Malik", "Anjo Vahldiek-Oberwagner", "Ivo Jimenez", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*P-RECS'22*"
+tags: ["reproducibility"]
+doi: "10.1145/3526062.3536354"
+---
+
diff --git a/content/publication/maltzahn-chi-95/cite.bib b/content/publication/maltzahn-chi-95/cite.bib
new file mode 100644
index 00000000000..23bac2661a9
--- /dev/null
+++ b/content/publication/maltzahn-chi-95/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{maltzahn:chi95,
+ abstract = {In a research community each researcher knows only a small fraction of the vast number of tools offered in the continually changing environment of local computer networks. Since the on-line or off-line documentation for these tools poorly supports people in finding the best tool for a given task, users prefer to ask colleagues. However, finding the right person to ask can be time consuming and asking questions can reveal incompetence. In this paper we present an architecture for a community-sensitive help system which actively collects information about Unix tools by tapping into accounting information generated by the operating system and by interviewing users that are selected on the basis of collected information. The result is a help system that continually seeks to update itself, that contains information that is entirely based on the community's perspective on tools, and that consequently grows with the community and its dynamic environments.},
+ address = {Denver, CO},
+ author = {Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUTS9tYWx0emFobi1jaGk5NS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SbWFsdHphaG4tY2hpOTUucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi1jaGk5NS5wZGYADgAmABIAbQBhAGwAdAB6AGEAaABuAC0AYwBoAGkAOQA1AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLWNoaTk1LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {CHI '95},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:06:12 -0700},
+ keywords = {papers, cscw},
+ month = {May},
+ title = {Community Help: Discovering Tools and Locating Experts in a Dynamic Environment},
+ year = {1995}
+}
+
diff --git a/content/publication/maltzahn-chi-95/index.md b/content/publication/maltzahn-chi-95/index.md
new file mode 100644
index 00000000000..cda54470117
--- /dev/null
+++ b/content/publication/maltzahn-chi-95/index.md
@@ -0,0 +1,12 @@
+---
+title: "Community Help: Discovering Tools and Locating Experts in a Dynamic Environment"
+date: 1995-05-01
+publishDate: 2020-01-05T13:33:05.999401Z
+authors: ["Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "In a research community each research er knows only a small fraction of the vast number of tools offered in the continually changing environment of local computer networks. Since the on-line or off-line documentation for these tools poorly support people in finding the best tool for a given task, users prefer to ask colleagues. however, finding the right person to ask can be time consuming and asking questions can reveal incompetence. In this paper we present an architecture to a community sensitive help system which actively collects information about Unix tools by tapping into accounting information generated by the operating system and by interviewing users that are selected on the basis of collected information. The result is a help system that continually seeks to update itself, that contains information that is entirely based on the community's perspective on tools, and that consequently grows with the community and its dynamic environments."
+featured: false
+publication: "*CHI '95*"
+tags: ["papers", "cscw"]
+---
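+
+As a hypothetical sketch of the data flow described above (the record format and ranking heuristic are assumptions, not the system's implementation), tallying accounting records per tool and per user is already enough to nominate candidate experts to interview:
+
+```python
+from collections import Counter, defaultdict
+
+# (user, command) pairs; real input would come from OS accounting logs
+records = [
+    ("alice", "gnuplot"), ("bob", "gnuplot"), ("alice", "gnuplot"),
+    ("carol", "awk"), ("bob", "awk"), ("bob", "awk"),
+]
+
+usage = defaultdict(Counter)      # tool -> Counter of users
+for user, tool in records:
+    usage[tool][user] += 1
+
+def experts(tool, k=2):
+    """Top-k candidate experts for a tool, ranked by observed usage."""
+    return [u for u, _ in usage[tool].most_common(k)]
+
+print(experts("gnuplot"))  # ['alice', 'bob']
+```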
+
diff --git a/content/publication/maltzahn-cutr-99/cite.bib b/content/publication/maltzahn-cutr-99/cite.bib
new file mode 100644
index 00000000000..fd8e738cff1
--- /dev/null
+++ b/content/publication/maltzahn-cutr-99/cite.bib
@@ -0,0 +1,16 @@
+@techreport{maltzahn:cutr99,
+ abstract = {The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates.
+We calculate the potential reduction in bandwidth for a given bandwidth usage profile, and show that a simple heuristic has poor prefetch accuracy. We then apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites.},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald and James Martin},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVTS9tYWx0emFobi1jdXRyOTkucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E21hbHR6YWhuLWN1dHI5OS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tY3V0cjk5LnBkZgAADgAoABMAbQBhAGwAdAB6AGEAaABuAC0AYwB1AHQAcgA5ADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tY3V0cjk5LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ date-added = {2012-12-07 22:58:31 +0000},
+ date-modified = {2020-01-05 05:26:05 -0700},
+ institution = {Dept. of Computer Science, University of Colorado at Boulder},
+ keywords = {papers, prefetching, caching, machinelearning, networking, intermediary},
+ month = {January},
+ number = {CU-CS-879-99},
+ title = {A Feasibility Study of Bandwidth Smoothing on the World-Wide Web Using Machine Learning},
+ type = {Technical Report},
+ year = {1999}
+}
+
diff --git a/content/publication/maltzahn-cutr-99/index.md b/content/publication/maltzahn-cutr-99/index.md
new file mode 100644
index 00000000000..d022ff92c6f
--- /dev/null
+++ b/content/publication/maltzahn-cutr-99/index.md
@@ -0,0 +1,12 @@
+---
+title: "A Feasibility Study of Bandwidth Smoothing on the World-Wide Web Using Machine Learning"
+date: 1999-01-01
+publishDate: 2020-01-05T12:39:43.034435Z
+authors: ["Carlos Maltzahn", "Kathy Richardson", "Dirk Grunwald", "James Martin"]
+publication_types: ["4"]
+abstract: "The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates. We calculate the potential reduction in bandwidth for a given bandwidth usage profile, and show that a simple hueristic has poor prefetch accuracy. We then apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites."
+featured: false
+publication: ""
+tags: ["papers", "prefetching", "caching", "machinelearning", "networking", "intermediary"]
+---
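+
+A hypothetical back-of-the-envelope version of the feasibility calculation (the profile values and the greedy feasibility test are assumptions, not the report's method): given an hourly bandwidth profile and the fraction of traffic that is prefetchable, find the lowest peak cap such that the excess above the cap fits into spare off-peak capacity.
+
+```python
+def smoothed_peak(profile, prefetch_frac):
+    """Lowest hourly cap reachable by shifting prefetchable traffic
+    from peak hours into spare off-peak capacity."""
+    total = sum(profile)
+    best = max(profile)
+    for cap in range(best, 0, -1):
+        excess = sum(max(0, h - cap) for h in profile)  # must be moved out
+        spare = sum(max(0, cap - h) for h in profile)   # room to move into
+        if excess > min(spare, prefetch_frac * total):  # simplistic budget
+            break
+        best = cap
+    return best
+
+# assumed diurnal profile in Mbit/s, one value per hour
+profile = [20, 15, 10, 10, 15, 30, 60, 90, 100, 95, 90, 70,
+           65, 80, 95, 100, 90, 75, 60, 50, 40, 35, 30, 25]
+print(smoothed_peak(profile, 0.12))  # new peak if 12% of traffic is prefetchable
+```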
+
diff --git a/content/publication/maltzahn-ddas-07/cite.bib b/content/publication/maltzahn-ddas-07/cite.bib
new file mode 100644
index 00000000000..b67af34ab54
--- /dev/null
+++ b/content/publication/maltzahn-ddas-07/cite.bib
@@ -0,0 +1,18 @@
+@article{maltzahn:ddas07,
+ abstract = {Managing storage in the face of relentless growth in the number and variety of files on storage systems creates demand for rich file system metadata as is made evident by the recent emergence of rich metadata support in many applications as well as file systems. Yet, little support exists for sharing metadata across file systems even though it is not uncommon for users to manage multiple file systems and to frequently share copies of files across devices and with other users. Encouraged by the surge in popularity of collaborative bookmarking sites that share the burden of creating metadata for online content [21], we present Graffiti, a distributed organization layer for collaboratively sharing rich metadata across heterogeneous file systems. The primary purpose of Graffiti is to provide a research and rapid prototyping platform for managing metadata across file systems and users.},
+ author = {Carlos Maltzahn and Nikhil Bobb and Mark W. Storer and Damian Eads and Scott A. Brandt and Ethan L. Miller},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVTS9tYWx0emFobi1kZGFzMDcucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E21hbHR6YWhuLWRkYXMwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZGRhczA3LnBkZgAADgAoABMAbQBhAGwAdAB6AGEAaABuAC0AZABkAGEAcwAwADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tZGRhczA3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {Distributed Data & Structures 7},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:13:12 -0700},
+ editor = {Thomas Schwarz},
+ journal = {Proceedings in Informatics},
+ keywords = {papers, pim, tagging, distributed, naming, linking, metadata},
+ pages = {97-111},
+ publisher = {Carleton Scientific},
+ read = {Yes},
+ title = {Graffiti: A Framework for Testing Collaborative Distributed Metadata},
+ volume = {21},
+ year = {2007}
+}
+
diff --git a/content/publication/maltzahn-ddas-07/index.md b/content/publication/maltzahn-ddas-07/index.md
new file mode 100644
index 00000000000..7d15b03cb42
--- /dev/null
+++ b/content/publication/maltzahn-ddas-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "Graffiti: A Framework for Testing Collaborative Distributed Metadata"
+date: 2007-01-01
+publishDate: 2020-01-05T13:33:06.007474Z
+authors: ["Carlos Maltzahn", "Nikhil Bobb", "Mark W. Storer", "Damian Eads", "Scott A. Brandt", "Ethan L. Miller"]
+publication_types: ["2"]
+abstract: "Managing storage in the face of relentless growth in the number and variety of files on storage systems creates demand for rich file system metadata as is made evident by the recent emergence of rich metadata support in many applications as well as file systems. Yet, little support exists for sharing metadata across file systems even though it is not uncommon for users to manage multiple file systems and to frequently share copies of files across devices and with other users. Encouraged by the surge in popularity for collaborative bookmarking sites that share the burden of creating metadata for online content [21] we present Graffiti, a distributed organization layer for collaboratively sharing rich metadata across heterogeneous file systems. The primary purpose of Graffiti is to provide a research and rapid prototyping platform for managing metadata across file systems and users."
+featured: false
+publication: "*Distributed Data & Structures 7*"
+tags: ["papers", "pim", "tagging", "distributed", "naming", "linking", "metadata"]
+---
+
diff --git a/content/publication/maltzahn-fast-08-wip/cite.bib b/content/publication/maltzahn-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..7a5a6e7fc99
--- /dev/null
+++ b/content/publication/maltzahn-fast-08-wip/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{maltzahn:fast08wip,
+ address = {San Jose, CA},
+ author = {Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYTS9tYWx0emFobi1mYXN0MDh3aXAucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////Fm1hbHR6YWhuLWZhc3QwOHdpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZmFzdDA4d2lwLnBkZgAOAC4AFgBtAGEAbAB0AHoAYQBoAG4ALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1mYXN0MDh3aXAucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:18:24 -0800},
+ date-modified = {2020-01-04 20:29:07 -0700},
+ keywords = {shortpapers, filesystems, metadata, pim},
+ month = {February 26-29},
+ title = {How Private are Home Directories?},
+ year = {2008}
+}
+
diff --git a/content/publication/maltzahn-fast-08-wip/index.md b/content/publication/maltzahn-fast-08-wip/index.md
new file mode 100644
index 00000000000..a5537789c1a
--- /dev/null
+++ b/content/publication/maltzahn-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "How Private are Home Directories?"
+date: 2008-02-01
+publishDate: 2020-01-05T06:43:50.377145Z
+authors: ["Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)*"
+tags: ["shortpapers", "filesystems", "metadata", "pim"]
+---
+
diff --git a/content/publication/maltzahn-fast-10/cite.bib b/content/publication/maltzahn-fast-10/cite.bib
new file mode 100644
index 00000000000..29e824aed12
--- /dev/null
+++ b/content/publication/maltzahn-fast-10/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{maltzahn:fast10,
+ address = {San Jose, CA},
+ author = {Carlos Maltzahn and Michael Mateas and Jim Whitehead},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYTS9tYWx0emFobi1mYXN0d2lwMTAucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////Fm1hbHR6YWhuLWZhc3R3aXAxMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZmFzdHdpcDEwLnBkZgAOAC4AFgBtAGEAbAB0AHoAYQBoAG4ALQBmAGEAcwB0AHcAaQBwADEAMAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1mYXN0d2lwMTAucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAeTS9tYWx0emFobi1mYXN0d2lwMTBzbGlkZXMucGRmTxEBkAAAAAABkAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////HG1hbHR6YWhuLWZhc3R3aXAxMHNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEQvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZmFzdHdpcDEwc2xpZGVzLnBkZgAOADoAHABtAGEAbAB0AHoAYQBoAG4ALQBmAGEAcwB0AHcAaQBwADEAMABzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALy9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1mYXN0d2lwMTBzbGlkZXMucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABFAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ bdsk-file-3 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAeTS9tYWx0emFobi1mYXN0d2lwMTBwb3N0ZXIucGRmTxEBkAAAAAABkAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////HG1hbHR6YWhuLWZhc3R3aXAxMHBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEQvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZmFzdHdpcDEwcG9zdGVyLnBkZgAOADoAHABtAGEAbAB0AHoAYQBoAG4ALQBmAGEAcwB0AHcAaQBwADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALy9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1mYXN0d2lwMTBwb3N0ZXIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABFAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ bdsk-url-1 = {http://www.cs.ucsc.edu/%7Ecarlosm/Infogarden/FAST_2010_WiP_Talk.html},
+ booktitle = {Work-in-Progress and Poster Session at FAST'10},
+ date-added = {2010-03-01 16:46:58 -0800},
+ date-modified = {2019-04-20 22:49:33 -0700},
+ keywords = {papers, casual, games, ir, datamanagement, pim},
+ month = {February 24-27},
+ title = {InfoGarden: A Casual-Game Approach to Digital Archive Management},
+ year = {2010}
+}
+
diff --git a/content/publication/maltzahn-fast-10/index.md b/content/publication/maltzahn-fast-10/index.md
new file mode 100644
index 00000000000..dd2646dc56f
--- /dev/null
+++ b/content/publication/maltzahn-fast-10/index.md
@@ -0,0 +1,16 @@
+---
+title: "InfoGarden: A Casual-Game Approach to Digital Archive Management"
+date: 2010-02-01
+publishDate: 2020-01-05T06:43:50.599421Z
+authors: ["Carlos Maltzahn", "Michael Mateas", "Jim Whitehead"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress and Poster Session at FAST'10*"
+url_pdf: "http://users.soe.ucsc.edu/~carlosm/Papers/maltzahn-fast10abstract.pdf"
+url_slides: "http://users.soe.ucsc.edu/~carlosm/Papers/maltzahn-fast10wip.pdf"
+url_video: "http://www.cs.ucsc.edu/%7Ecarlosm/Infogarden/FAST_2010_WiP_Talk.html"
+url_poster: "http://users.soe.ucsc.edu/~carlosm/Papers/maltzahn-fast10poster.pdf"
+tags: ["papers", "casual", "games", "ir", "datamanagement", "pim"]
+---
+
diff --git a/content/publication/maltzahn-gamifir-14/cite.bib b/content/publication/maltzahn-gamifir-14/cite.bib
new file mode 100644
index 00000000000..7ec11fe0f19
--- /dev/null
+++ b/content/publication/maltzahn-gamifir-14/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{maltzahn:gamifir14,
+ abstract = {The super-exponential growth of digital data world-wide is matched by personal digital archives containing songs, ebooks, audio books, photos, movies, textual documents, and documents of other media types. For many types of media it is usually a lot easier to add items than to keep archives from falling into disarray and incurring data loss. The overhead of maintaining these personal archives frequently surpasses the time and patience their owners are willing to dedicate to this important task. The promise of gamification in this context is to significantly extend the willingness to maintain personal archives by enhancing the experience of personal archive management.
+In this paper we focus on a subcategory of personal archives which we call private archives. These are archives that, for a variety of reasons, the owner does not want to make available online, which consequently limits archive maintenance to an individual activity and rules out any form of crowdsourcing for fear of unwanted information leaks. As an example of private digital archive maintenance gamification we describe InfoGarden, a casual game that turns document tagging into an individual activity of (metaphorically) weeding a garden and protecting plants from gophers and includes a reward system that encourages orthogonal tag usage. The paper concludes with lessons learned and summarizes remaining challenges.},
+ address = {Amsterdam, Netherlands},
+ author = {Carlos Maltzahn and Arnav Jhala and Michael Mateas and Jim Whitehead},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYTS9tYWx0emFobi1nYW1pZmlyMTQucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////Fm1hbHR6YWhuLWdhbWlmaXIxNC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZ2FtaWZpcjE0LnBkZgAOAC4AFgBtAGEAbAB0AHoAYQBoAG4ALQBnAGEAbQBpAGYAaQByADEANAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1nYW1pZmlyMTQucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ booktitle = {GamifIR'14 at ECIR'14},
+ date-added = {2014-04-22 01:27:12 +0000},
+ date-modified = {2020-01-04 21:59:05 -0700},
+ keywords = {papers, gamification, games, archive, digitalpreservation, tagging},
+ month = {April 13},
+ title = {Gamification of Private Digital Data Archive Management},
+ year = {2014}
+}
+
diff --git a/content/publication/maltzahn-gamifir-14/index.md b/content/publication/maltzahn-gamifir-14/index.md
new file mode 100644
index 00000000000..155a7192b34
--- /dev/null
+++ b/content/publication/maltzahn-gamifir-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Gamification of Private Digital Data Archive Management"
+date: 2014-04-01
+publishDate: 2020-01-05T06:43:50.509804Z
+authors: ["Carlos Maltzahn", "Arnav Jhala", "Michael Mateas", "Jim Whitehead"]
+publication_types: ["1"]
+abstract: "The super-exponential growth of digital data world-wide is matched by personal digital archives containing songs, ebooks, audio books, photos, movies, textual documents, and documents of other media types. For many types of media it is usually a lot easier to add items than to keep archives from falling into disarray and incurring data loss. The overhead of maintaining these personal archives frequently surpasses the time and patience their owners are willing to dedicate to this important task. The promise of gamification in this context is to significantly extend the willingness to maintain personal archives by enhancing the experience of personal archive management. In this paper we focus on a subcategory of personal archives which we call private archives. These are archives that for a variety of reasons the owner does not want to make available online and which consequently limits archive maintenance to an individual activity and does not allow any form of crowdsourcing out of fear for unwanted information leaks. As an example of private digital archive maintenance gamification we describe InfoGarden, a casual game that turns document tagging into an individual activity of (metaphorically) weeding a garden and protecting plants from gophers and includes a reward system that encourages orthogonal tag usage. The paper concludes with lessons learned and summarizes remaining challenges."
+featured: false
+publication: "*GamifIR'14 at ECIR'14*"
+tags: ["papers", "gamification", "games", "archive", "digitalpreservation", "tagging"]
+---
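+
+One way to make "a reward system that encourages orthogonal tag usage" concrete is to pay out more for tags whose document sets overlap little with existing ones. The function below is a hypothetical sketch of that idea, not InfoGarden's actual scoring:
+
+```python
+def jaccard(a, b):
+    return len(a & b) / len(a | b) if a | b else 0.0
+
+def orthogonality_reward(tag_docs, candidate_docs):
+    """1.0 if the candidate tag overlaps no existing tag; 0.0 if it duplicates one."""
+    if not tag_docs:
+        return 1.0
+    return 1.0 - max(jaccard(docs, candidate_docs) for docs in tag_docs.values())
+
+tags = {"travel": {1, 2, 3}, "receipts": {4, 5}}
+print(orthogonality_reward(tags, {2, 3, 6}))  # 0.5: redundant with 'travel'
+print(orthogonality_reward(tags, {7, 8}))     # 1.0: fully orthogonal
+```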
+
diff --git a/content/publication/maltzahn-hotstorage-18-breakout/cite.bib b/content/publication/maltzahn-hotstorage-18-breakout/cite.bib
new file mode 100644
index 00000000000..ce37c9eafa1
--- /dev/null
+++ b/content/publication/maltzahn-hotstorage-18-breakout/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{maltzahn:hotstorage18-breakout,
+ address = {Boston, MA},
+ author = {Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAkTS9tYWx0emFobi1ob3RzdG9yYWdlMTgtYnJlYWtvdXQucGRmTxEBqAAAAAABqAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////H21hbHR6YWhuLWhvdHN0b3JhZyNGRkZGRkZGRi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEovOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4taG90c3RvcmFnZTE4LWJyZWFrb3V0LnBkZgAOAEYAIgBtAGEAbAB0AHoAYQBoAG4ALQBoAG8AdABzAHQAbwByAGEAZwBlADEAOAAtAGIAcgBlAGEAawBvAHUAdAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIANS9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1ob3RzdG9yYWdlMTgtYnJlYWtvdXQucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABLAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAfc=},
+ bdsk-url-1 = {https://docs.google.com/presentation/d/1yvXWpxfNWZ4NIL9GLLWM_e3TAm-8Mu-EfAygo1SRRlg/edit?usp=sharing},
+ bdsk-url-2 = {https://docs.google.com/document/d/1Vfuoy2H8Mg2PrweO5I2sP04gAZonhUIxE3_W9oMFhwI/edit?usp=sharing},
+ booktitle = {Breakouts Session abstract at 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage'18, co-located with USENIX ATC'18)},
+ date-added = {2019-12-26 19:10:01 -0800},
+ date-modified = {2019-12-29 16:35:30 -0800},
+ keywords = {shortpapers, storage, embedded, eusocial},
+ month = {July 9-10},
+ title = {Should Storage Devices Stay Dumb or Become Smart?},
+ year = {2018}
+}
+
diff --git a/content/publication/maltzahn-hotstorage-18-breakout/index.md b/content/publication/maltzahn-hotstorage-18-breakout/index.md
new file mode 100644
index 00000000000..1c54433db57
--- /dev/null
+++ b/content/publication/maltzahn-hotstorage-18-breakout/index.md
@@ -0,0 +1,16 @@
+---
+title: "Should Storage Devices Stay Dumb or Become Smart?"
+date: 2018-07-01
+publishDate: 2020-01-05T06:43:50.389702Z
+authors: ["Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Breakouts Session abstract at 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage'18, co-located with USENIX ATC'18)*"
+tags: ["shortpapers", "storage", "embedded", "eusocial"]
+projects:
+- declstore
+- programmable-storage
+- eusocial-storage
+---
+
diff --git a/content/publication/maltzahn-login-10/cite.bib b/content/publication/maltzahn-login-10/cite.bib
new file mode 100644
index 00000000000..59a7bda61cf
--- /dev/null
+++ b/content/publication/maltzahn-login-10/cite.bib
@@ -0,0 +1,14 @@
+@article{maltzahn:login10,
+ abstract = {The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop.},
+ author = {Carlos Maltzahn and Esteban Molina-Estolano and Amandeep Khurana and Alex J. Nelson and Scott A. Brandt and Sage A. Weil},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWTS9tYWx0emFobi1sb2dpbjEwLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRtYWx0emFobi1sb2dpbjEwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTQAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLWxvZ2luMTAucGRmAA4AKgAUAG0AYQBsAHQAegBhAGgAbgAtAGwAbwBnAGkAbgAxADAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tbG9naW4xMC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2010-09-30 15:19:48 -0700},
+ date-modified = {2020-01-05 05:43:26 -0700},
+ journal = {;login: The USENIX Magazine},
+ keywords = {papers, filesystems, parallel, hadoop, mapreduce, storage},
+ number = {4},
+ title = {Ceph as a Scalable Alternative to the Hadoop Distributed File System},
+ volume = {35},
+ year = {2010}
+}
+
diff --git a/content/publication/maltzahn-login-10/index.md b/content/publication/maltzahn-login-10/index.md
new file mode 100644
index 00000000000..7a14183f4e0
--- /dev/null
+++ b/content/publication/maltzahn-login-10/index.md
@@ -0,0 +1,12 @@
+---
+title: "Ceph as a Scalable Alternative to the Hadoop Distributed File System"
+date: 2010-01-01
+publishDate: 2020-01-05T13:33:05.987677Z
+authors: ["Carlos Maltzahn", "Esteban Molina-Estolano", "Amandeep Khurana", "Alex J. Nelson", "Scott A. Brandt", "Sage A. Weil"]
+publication_types: ["2"]
+abstract: "The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop."
+featured: false
+publication: "*;login: The USENIX Magazine*"
+tags: ["papers", "filesystems", "parallel", "hadoop", "mapreduce", "storage"]
+---
+
diff --git a/content/publication/maltzahn-per-97/cite.bib b/content/publication/maltzahn-per-97/cite.bib
new file mode 100644
index 00000000000..a0f72a8d517
--- /dev/null
+++ b/content/publication/maltzahn-per-97/cite.bib
@@ -0,0 +1,16 @@
+@article{maltzahn:per97,
+ abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on request service times.},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbTS9tYWx0emFobi1zaWdtZXRyaWNzOTcucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GW1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgAADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:18:29 -0700},
+ journal = {ACM SIGMETRICS Performance Evaluation Review},
+ keywords = {papers, performance, webcaching, networking, intermediary},
+ month = {June},
+ number = {1},
+ pages = {13-23},
+ title = {Performance Issues of Enterprise Level Web Proxies},
+ volume = {25},
+ year = {1997}
+}
+
diff --git a/content/publication/maltzahn-per-97/index.md b/content/publication/maltzahn-per-97/index.md
new file mode 100644
index 00000000000..ab7ea297d04
--- /dev/null
+++ b/content/publication/maltzahn-per-97/index.md
@@ -0,0 +1,12 @@
+---
+title: "Performance Issues of Enterprise Level Web Proxies"
+date: 1997-06-01
+publishDate: 2020-01-05T13:33:06.012417Z
+authors: ["Carlos Maltzahn", "Kathy Richardson", "Dirk Grunwald"]
+publication_types: ["2"]
+abstract: "Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly we found that cache hit rates of around 30% had very little effect on the requests service time."
+featured: false
+publication: "*ACM SIGMETRICS Performance Evaluation Review*"
+tags: ["papers", "performance", "webcaching", "networking", "intermediary"]
+---
+
diff --git a/content/publication/maltzahn-phdthesis-99/cite.bib b/content/publication/maltzahn-phdthesis-99/cite.bib
new file mode 100644
index 00000000000..13b8cf1fa2c
--- /dev/null
+++ b/content/publication/maltzahn-phdthesis-99/cite.bib
@@ -0,0 +1,15 @@
+@phdthesis{maltzahn:phdthesis99,
+ abstract = {The resource utilization of enterprise-level Web proxy servers is primarily dependent on network and disk I/O latencies and is highly variable due to a diurnal workload pattern with very predictable peak and off-peak periods. Often, the cost of resources depends on the purchased resource capacity instead of the actual utilization. This motivates the use of off-peak periods to perform speculative work in the hope that this work will later reduce resource utilization during peak periods. We take two approaches to improve resource utilization.
+In the first approach we reduce disk I/O by cache compaction during off-peak periods and by carefully designing the way a cache architecture utilizes operating system services such as the file system buffer cache and the virtual memory system. Evaluating our designs with workload generators on standard file systems we achieve disk I/O savings of over 70% compared to existing Web proxy server architectures.
+In the second approach we reduce peak bandwidth levels by prefetching bandwidth during off-peak periods. Our analysis reveals that 40% of the cacheable miss bandwidth is prefetchable. We found that 99% of this prefetchable bandwidth is based on objects that the Web proxy server under study has not accessed before. However, these objects originate from servers which the Web proxy server under study has accessed before. Using machine learning techniques we are able to automatically generate prefetch strategies of high accuracy and medium coverage. A test of these prefetch strategies on real workloads achieves a peak-level reduction of up to 12%.},
+ address = {Boulder, Co},
+ author = {Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAaTS9tYWx0emFobi1waGR0aGVzaXM5OS5wZGZPEQGAAAAAAAGAAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8YbWFsdHphaG4tcGhkdGhlc2lzOTkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAQC86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi1waGR0aGVzaXM5OS5wZGYADgAyABgAbQBhAGwAdAB6AGEAaABuAC0AcABoAGQAdABoAGUAcwBpAHMAOQA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgArL015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXBoZHRoZXNpczk5LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAQQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF},
+ date-added = {2012-12-07 22:22:06 +0000},
+ date-modified = {2020-01-05 05:26:52 -0700},
+ keywords = {papers, prefetching, networking, intermediary, caching, performance, machinelearning},
+ school = {University of Colorado at Boulder},
+ title = {Improving Resource Utilization of Enterprise-Level World-Wide Web Proxy Servers},
+ year = {1999}
+}
+
diff --git a/content/publication/maltzahn-phdthesis-99/index.md b/content/publication/maltzahn-phdthesis-99/index.md
new file mode 100644
index 00000000000..6d332fe7c54
--- /dev/null
+++ b/content/publication/maltzahn-phdthesis-99/index.md
@@ -0,0 +1,12 @@
+---
+title: "Improving Resource Utilization of Enterprise-Level World-Wide Web Proxy Servers"
+date: 1999-01-01
+publishDate: 2020-01-05T12:39:43.035343Z
+authors: ["Carlos Maltzahn"]
+publication_types: ["7"]
+abstract: "The resource utilization of enterprise-level Web proxy servers is primarily dependent on network and disk I/O latencies and is highly variable due to a diurnal workload pattern with very predictable peak and off-peak periods. Often, the cost of resources depends on the purchased resource capacity instead of the actual utilization. This motivates the use of off-peak periods to perform speculative work in the hope that this work will later reduce resource utilization during peak periods. We take two approaches to improve resource utilization. In the first approach we reduce disk I/O by cache compaction during off-peak periods and by carefully designing the way a cache architecture utilizes operating system services such as the file system buffer cache and the virtual memory system. Evaluating our designs with workload generators on standard file systems we achieve disk I/O savings of over 70% compared to existing Web proxy server architectures. In the second approach we reduce peak bandwidth levels by prefetching bandwidth dur- ing off-peak periods. Our analysis reveals that 40% of the cacheable miss bandwidth is prefetch- able. We found that 99% of this prefetchable bandwidth is based on objects that the Web proxy server under study has not accessed before. However, these objects originate from servers which the Web proxy server under study has accessed before. Using machine learning techniques we are able to automatically generate prefetch strategies of high accuracy and medium coverage. A test of these prefetch strategies on real workloads achieves a peak-level reduction of up to 12%."
+featured: false
+publication: ""
+tags: ["papers", "prefetching", "networking", "intermediary", "caching", "performance", "machinelearning"]
+---
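+
+The abstract does not name the learning technique used to generate prefetch strategies. As a hypothetical sketch of the idea (the features, labels, and decision-tree model are assumptions, not the thesis's method), a small classifier over trace-derived features decides which objects to prefetch during off-peak periods:
+
+```python
+from sklearn.tree import DecisionTreeClassifier
+
+# features per candidate object: [server hit rate, hours since last access, size in KB]
+X = [[0.9, 1, 12], [0.2, 48, 900], [0.8, 2, 30], [0.1, 72, 15],
+     [0.7, 3, 25], [0.3, 24, 400]]
+y = [1, 0, 1, 0, 1, 0]  # 1 = prefetching this object paid off in the trace
+
+clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
+print(clf.predict([[0.85, 2, 20]]))  # -> [1]: schedule for off-peak prefetch
+```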
+
diff --git a/content/publication/maltzahn-si-2-ws-poster-16/cite.bib b/content/publication/maltzahn-si-2-ws-poster-16/cite.bib
new file mode 100644
index 00000000000..c8420cf7c1c
--- /dev/null
+++ b/content/publication/maltzahn-si-2-ws-poster-16/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{maltzahn:si2ws-poster16,
+ author = {Carlos Maltzahn and others},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAdTS9tYWx0emFobi1zaTJ3cy1wb3N0ZXIxNi5wZGZPEQGMAAAAAAGMAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8bbWFsdHphaG4tc2kyd3MtcG9zdGVyMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAQy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi1zaTJ3cy1wb3N0ZXIxNi5wZGYAAA4AOAAbAG0AYQBsAHQAegBhAGgAbgAtAHMAaQAyAHcAcwAtAHAAbwBzAHQAZQByADEANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALi9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1zaTJ3cy1wb3N0ZXIxNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQARAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHU},
+ date-added = {2016-08-18 06:04:41 +0000},
+ date-modified = {2020-01-04 21:49:20 -0700},
+ keywords = {shortpapers, overview, bigdata, reproducibility},
+ month = {February 16},
+ note = {Poster at SI2 Workshop},
+ title = {Big Weather Web: A common and sustainable big data infrastructure in support of weather prediction research and education in universities},
+ year = {2016}
+}
+
diff --git a/content/publication/maltzahn-si-2-ws-poster-16/index.md b/content/publication/maltzahn-si-2-ws-poster-16/index.md
new file mode 100644
index 00000000000..6e41daf0545
--- /dev/null
+++ b/content/publication/maltzahn-si-2-ws-poster-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "Big Weather Web: A common and sustainable big data infrastructure in support of weather prediction research and education in universities"
+date: 2016-02-01
+publishDate: 2020-01-05T06:43:50.462303Z
+authors: ["Carlos Maltzahn", " others"]
+publication_types: ["3"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["shortpapers", "overview", "bigdata", "reproducibility"]
+---
+
diff --git a/content/publication/maltzahn-sigmetrics-97/cite.bib b/content/publication/maltzahn-sigmetrics-97/cite.bib
new file mode 100644
index 00000000000..b075d54a6e6
--- /dev/null
+++ b/content/publication/maltzahn-sigmetrics-97/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{maltzahn:sigmetrics97,
+ abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on request service times.},
+ address = {Seattle, WA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbTS9tYWx0emFobi1zaWdtZXRyaWNzOTcucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GW1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgAADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ booktitle = {SIGMETRICS 1997},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:19:28 -0700},
+ keywords = {papers, performance, tracing, networking, intermediary, webcaching},
+ month = {June 15-18},
+ pages = {13--23},
+ read = {Yes},
+ title = {Performance Issues of Enterprise Level Web Proxies},
+ year = {1997}
+}
+
diff --git a/content/publication/maltzahn-sigmetrics-97/index.md b/content/publication/maltzahn-sigmetrics-97/index.md
new file mode 100644
index 00000000000..26248146214
--- /dev/null
+++ b/content/publication/maltzahn-sigmetrics-97/index.md
@@ -0,0 +1,12 @@
+---
+title: "Performance Issues of Enterprise Level Web Proxies"
+date: 1997-06-01
+publishDate: 2020-01-05T13:33:06.013234Z
+authors: ["Carlos Maltzahn", "Kathy Richardson", "Dirk Grunwald"]
+publication_types: ["1"]
+abstract: "Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly we found that cache hit rates of around 30% had very little effect on the requests service time."
+featured: false
+publication: "*SIGMETRICS 1997*"
+tags: ["papers", "performance", "tracing", "networking", "intermediary", "webcaching"]
+---
+
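The 30%-hit-rate finding above is easy to sanity-check with the standard hit-rate/latency identity E[T] = h·t_hit + (1−h)·t_miss. A minimal sketch with made-up latencies (illustrative only, not the paper's measurements): when per-request proxy overhead dominates both paths, even a 30% hit rate moves the average only modestly.

```python
# Expected request service time as a weighted average of hit and miss latencies.
# The latencies below are hypothetical, chosen to mimic an overhead-dominated proxy.

def expected_service_time(hit_rate, t_hit_ms, t_miss_ms):
    """E[T] = h * t_hit + (1 - h) * t_miss."""
    return hit_rate * t_hit_ms + (1 - hit_rate) * t_miss_ms

t_hit, t_miss = 40.0, 60.0  # hypothetical: hits are not much cheaper than misses
for h in (0.0, 0.3):
    print(f"hit rate {h:.0%}: E[T] = {expected_service_time(h, t_hit, t_miss):.1f} ms")
# hit rate 0%:  E[T] = 60.0 ms
# hit rate 30%: E[T] = 54.0 ms  -- only a 10% improvement
```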
diff --git a/content/publication/maltzahn-usenix-99/cite.bib b/content/publication/maltzahn-usenix-99/cite.bib
new file mode 100644
index 00000000000..a33fe3a8a1f
--- /dev/null
+++ b/content/publication/maltzahn-usenix-99/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{maltzahn:usenix99,
+ abstract = {The dramatic increase of HTTP traffic on the Internet has resulted in widespread use of large caching proxy servers as critical Internet infrastructure components. With continued growth the demand for larger caches and higher performance proxies grows as well. The common bottleneck of large caching proxy servers is disk I/O. In this paper we evaluate ways to reduce the amount of required disk I/O. First we compare the file system interactions of two existing web proxy servers, CERN and SQUID. Then we show how design adjustments to the current SQUID cache architecture can dramatically reduce disk I/O. Our findings suggest that two strategies can significantly reduce disk I/O: (1) preserve locality of the HTTP reference stream while translating these references into cache references, and (2) use virtual memory instead of the file system for objects smaller than the system page size. The evaluated techniques reduced disk I/O by 50% to 70%.},
+ address = {Monterey, CA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXTS9tYWx0emFobi11c2VuaXg5OS5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VbWFsdHphaG4tdXNlbml4OTkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi11c2VuaXg5OS5wZGYAAA4ALAAVAG0AYQBsAHQAegBhAGgAbgAtAHUAcwBlAG4AaQB4ADkAOQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi11c2VuaXg5OS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {USENIX ATC '99},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:20:58 -0700},
+ keywords = {papers, networking, intermediary, storage, webcaching},
+ month = {June 6-11},
+ read = {Yes},
+ title = {Reducing the Disk I/O of Web Proxy Server Caches},
+ year = {1999}
+}
+
diff --git a/content/publication/maltzahn-usenix-99/index.md b/content/publication/maltzahn-usenix-99/index.md
new file mode 100644
index 00000000000..daa6bdcf638
--- /dev/null
+++ b/content/publication/maltzahn-usenix-99/index.md
@@ -0,0 +1,12 @@
+---
+title: "Reducing the Disk I/O of Web Proxy Server Caches"
+date: 1999-06-01
+publishDate: 2020-01-05T13:33:06.016512Z
+authors: ["Carlos Maltzahn", "Kathy Richardson", "Dirk Grunwald"]
+publication_types: ["1"]
+abstract: "The dramatic increase of HTTP traffic on the Internet has resulted in wide-spread use of large caching proxy servers as critical Internet infrastructure components. With continued growth the demand for larger caches and higher performance proxies grows as well. The common bottleneck of large caching proxy servers is disk I/O. In this paper we evaluate ways to reduce the amount of required disk I/O. First we compare the file system interactions of two existing web proxy servers, CERN and SQUID. Then we show how design adjustments to the current SQUID cache architecture can dramatically reduce disk I/O. Our findings suggest two that strategies can significantly reduce disk I/O: (1) preserve locality of the HTTP reference stream while translating these references into cache references, and (2) use virtual memory instead of the file system for objects smaller than the system page size. The evaluated techniques reduced disk I/O by 50% to 70%."
+featured: false
+publication: "*USENIX ATC '99*"
+tags: ["papers", "networking", "intermediary", "storage", "webcaching"]
+---
+
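Strategy (2) from the abstract amounts to routing objects by size at store time. A minimal sketch, assuming a dict-backed memory store and a directory for large objects (both hypothetical; the paper's actual changes were inside SQUID's cache architecture):

```python
# Route cached objects by size: sub-page objects stay in virtual memory,
# page-sized and larger objects go to the file system.
import os
from pathlib import Path

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE") if hasattr(os, "sysconf") else 4096

def store_object(key: str, payload: bytes, memory_store: dict, disk_dir: Path):
    """Keep objects smaller than the system page size out of the file system."""
    if len(payload) < PAGE_SIZE:
        memory_store[key] = payload            # VM-backed: no per-object disk I/O
    else:
        (disk_dir / key).write_bytes(payload)  # file system for large objects
```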
diff --git a/content/publication/maltzahn-vkika-91/cite.bib b/content/publication/maltzahn-vkika-91/cite.bib
new file mode 100644
index 00000000000..3c1638b8d77
--- /dev/null
+++ b/content/publication/maltzahn-vkika-91/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{maltzahn:vkika91,
+ abstract = {Most CAD environments emphasize support for individual workplaces and only secondarily help with cooperation among them. We propose the reverse approach: designs emerge within interacting sharing processes that structure the shared access of all participants to concepts, tasks, and results. This approach and its consequences are illustrated using software engineering as an example. Based on a formalization of these processes, the ConceptTalk prototype controls a distributed software environment and dedicated communication tools via the knowledge base system ConceptBase. Experience with ConceptTalk supports a new paradigm that regards an information system as a medium for complex communication.},
+ author = {Carlos Maltzahn and Thomas Rose},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWTS9tYWx0emFobi12a2lrYTkxLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRtYWx0emFobi12a2lrYTkxLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTQAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLXZraWthOTEucGRmAA4AKgAUAG0AYQBsAHQAegBhAGgAbgAtAHYAawBpAGsAYQA5ADEALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tdmtpa2E5MS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {Verteilte Künstliche Intelligenz und kooperatives Arbeiten},
+ date-added = {2019-12-26 18:32:03 -0800},
+ date-modified = {2020-01-04 21:16:07 -0700},
+ editor = {W. Brauer and D. Hernández},
+ keywords = {papers, cscw, softwareengineering},
+ pages = {195--206},
+ publisher = {Springer-Verlag Berlin Heidelberg},
+ title = {ConceptTalk: Kooperationsunterstützung in Softwareumgebungen},
+ volume = {291},
+ year = {1991}
+}
+
diff --git a/content/publication/maltzahn-vkika-91/index.md b/content/publication/maltzahn-vkika-91/index.md
new file mode 100644
index 00000000000..940cf198744
--- /dev/null
+++ b/content/publication/maltzahn-vkika-91/index.md
@@ -0,0 +1,12 @@
+---
+title: "ConceptTalk: Kooperationsunterstützung in Softwareumgebungen"
+date: 1991-01-01
+publishDate: 2020-01-05T06:43:50.398096Z
+authors: ["Carlos Maltzahn", "Thomas Rose"]
+publication_types: ["1"]
+abstract: "Die meisten CAD-Umgebungen betonen die Unterstatzung einzelner Arbeitspliitze und helfen nur sekundiir bei deren Kooperation. Wir schlagen einen umgekehrten Ansatz vor: Entwiirfe entstehen im Rahmen von interagierenden Sharing-Prozessen, die den gemeinsamen Zugang aller Beteiligten zu Konzepten, Aufgaben und Ergebnissen strukturieren. Dieser Ansatz und seine Konsequenzen werden am Beispiel des Software Engineering dargestellt. Aufder Basis einer Formalisierung dieser Prozesse steuert der ConceptTalk-Prototyp eine verteilte Softwareumgebung und speziel/e Kommunikationswerkzeuge aber das Wissensbanksystem ConceptBase. Erfahrungen mit ConceptTalk unterstatzen ein neues Paradigma, das ein Informationssystem als Medium for komplexe Kommunikation betrachtet."
+featured: false
+publication: "*Verteilte Künstliche Intelligenz und kooperatives Arbeiten*"
+tags: ["papers", "cscw", "softwareengineering"]
+---
+
diff --git a/content/publication/maltzahn-wcw-99/cite.bib b/content/publication/maltzahn-wcw-99/cite.bib
new file mode 100644
index 00000000000..26d3121c924
--- /dev/null
+++ b/content/publication/maltzahn-wcw-99/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{maltzahn:wcw99,
+ abstract = {The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates. We apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites.},
+ address = {San Diego, CA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald and James Martin},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUTS9tYWx0emFobi13Y3c5OS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SbWFsdHphaG4td2N3OTkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi13Y3c5OS5wZGYADgAmABIAbQBhAGwAdAB6AGEAaABuAC0AdwBjAHcAOQA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXdjdzk5LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {4th International Web Caching Workshop (WCW'99)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:17:30 -0700},
+ keywords = {papers, networking, intermediary, machinelearning, webcaching},
+ month = {March 31 - April 2},
+ title = {On Bandwidth Smoothing},
+ year = {1999}
+}
+
diff --git a/content/publication/maltzahn-wcw-99/index.md b/content/publication/maltzahn-wcw-99/index.md
new file mode 100644
index 00000000000..eb848e27730
--- /dev/null
+++ b/content/publication/maltzahn-wcw-99/index.md
@@ -0,0 +1,12 @@
+---
+title: "On Bandwidth Smoothing"
+date: 1999-03-01
+publishDate: 2020-01-05T13:33:06.011348Z
+authors: ["Carlos Maltzahn", "Kathy Richardson", "Dirk Grunwald", "James Martin"]
+publication_types: ["1"]
+abstract: "The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates. We apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites"
+featured: false
+publication: "*4th International Web Caching Workshop (WCW'99)*"
+tags: ["papers", "networking", "intermediary", "machinelearning", "webcaching"]
+---
+
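The core bandwidth-smoothing arithmetic is simple: whatever fraction of peak traffic can be served from content prefetched off-peak comes straight off the peak, and reappears as off-peak load. A back-of-the-envelope sketch with invented numbers (not from the trace study):

```python
# Illustrative bandwidth-smoothing arithmetic; all figures are hypothetical.
peak_mbps = 100.0    # measured peak-hour usage
offpeak_mbps = 30.0  # measured off-peak usage
prefetchable = 0.25  # fraction of peak traffic servable from prefetched content

smoothed_peak = peak_mbps * (1 - prefetchable)
new_offpeak = offpeak_mbps + peak_mbps * prefetchable  # prefetch traffic moves here
print(f"peak: {peak_mbps:.0f} -> {smoothed_peak:.0f} Mb/s; "
      f"off-peak: {offpeak_mbps:.0f} -> {new_offpeak:.0f} Mb/s")
# peak: 100 -> 75 Mb/s; off-peak: 30 -> 55 Mb/s
```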
diff --git a/content/publication/manzanares-hotstorage-16/cite.bib b/content/publication/manzanares-hotstorage-16/cite.bib
new file mode 100644
index 00000000000..8e1eb282832
--- /dev/null
+++ b/content/publication/manzanares-hotstorage-16/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{manzanares:hotstorage16,
+ address = {Denver, CO},
+ author = {Adam Manzanares and Noah Watkins and Cyril Guyot and Damien LeMoal and Carlos Maltzahn and Zvonimir Bandic},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAdTS9tYW56YW5hcmVzLWhvdHN0b3JhZ2UxNi5wZGZPEQGMAAAAAAGMAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8bbWFuemFuYXJlcy1ob3RzdG9yYWdlMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAQy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYW56YW5hcmVzLWhvdHN0b3JhZ2UxNi5wZGYAAA4AOAAbAG0AYQBuAHoAYQBuAGEAcgBlAHMALQBoAG8AdABzAHQAbwByAGEAZwBlADEANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALi9NeSBEcml2ZS9QYXBlcnMvTS9tYW56YW5hcmVzLWhvdHN0b3JhZ2UxNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQARAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHU},
+ booktitle = {HotStorage '16},
+ date-added = {2016-05-17 21:34:02 +0000},
+ date-modified = {2016-05-17 21:36:35 +0000},
+ keywords = {papers, storagemedium, shingledrecording, os, allocation},
+ month = {June 20-21},
+ title = {ZEA, A Data Management Approach for SMR},
+ year = {2016}
+}
+
diff --git a/content/publication/manzanares-hotstorage-16/index.md b/content/publication/manzanares-hotstorage-16/index.md
new file mode 100644
index 00000000000..b38caf0e0d9
--- /dev/null
+++ b/content/publication/manzanares-hotstorage-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "ZEA, A Data Management Approach for SMR"
+date: 2016-06-01
+publishDate: 2020-01-05T06:43:50.467381Z
+authors: [" Manzanares", "Noah Watkins", "Cyril Guyot", "Damien LeMoal", "Carlos Maltzahn", "Zvonimir Bandic"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*HotStorage '16*"
+tags: ["papers", "storagemedium", "shingledrecording", "os", "allocation"]
+---
+
diff --git a/content/publication/maricq-osdi-18/cite.bib b/content/publication/maricq-osdi-18/cite.bib
new file mode 100644
index 00000000000..e55cd8e9fbe
--- /dev/null
+++ b/content/publication/maricq-osdi-18/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{maricq:osdi18,
+ abstract = {The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution. This variation has important effects on the reproducibility of systems research and ability to quantitatively compare the performance of different systems. It also has implications for commercial computing, where agreements are often made conditioned on meeting specific performance targets.
+Over a period of 10 months, we conducted a large-scale study capturing nearly 900,000 data points from 835 servers. We examine this data from two perspectives: that of a service provider wishing to offer a consistent environment, and that of a systems researcher who must understand how variability impacts experimental results. From this examination, we draw a number of lessons about the types and magnitudes of performance variability and the effects on confidence in experiment results. We also create a statistical model that can be used to understand how representative an individual server is of the general population. The full dataset and our analysis tools are publicly available, and we have built a system to interactively explore the data and make recommendations for experiment parameters based on statistical analysis of historical data.},
+ address = {Carlsbad, CA},
+ author = {Aleksander Maricq and Dmitry Duplyakin and Ivo Jimenez and Carlos Maltzahn and Ryan Stutsman and Robert Ricci},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATTS9tYXJpY3Etb3NkaTE4LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFtYXJpY3Etb3NkaTE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTQAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpNOm1hcmljcS1vc2RpMTgucGRmAAAOACQAEQBtAGEAcgBpAGMAcQAtAG8AcwBkAGkAMQA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9NL21hcmljcS1vc2RpMTgucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI'18)},
+ date-added = {2018-07-21 02:10:24 +0000},
+ date-modified = {2020-01-04 21:30:49 -0700},
+ keywords = {papers, performance, statistics, cloud, reproducibility, systems},
+ month = {October 8-10},
+ title = {Taming Performance Variability},
+ year = {2018}
+}
+
diff --git a/content/publication/maricq-osdi-18/index.md b/content/publication/maricq-osdi-18/index.md
new file mode 100644
index 00000000000..48d135a2c79
--- /dev/null
+++ b/content/publication/maricq-osdi-18/index.md
@@ -0,0 +1,15 @@
+---
+title: "Taming Performance Variability"
+date: 2018-10-01
+publishDate: 2020-01-05T06:43:50.428157Z
+authors: ["Aleksander Maricq", "Dmitry Duplyakin", "Ivo Jimenez", "Carlos Maltzahn", "Ryan Stutsman", "Robert Ricci"]
+publication_types: ["1"]
+abstract: "The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution. This variation has important effects on the reproducibility of systems research and ability to quantitatively compare the performance of different systems. It also has implications for commercial computing, where agreements are often made conditioned on meeting specific performance targets. Over a period of 10 months, we conducted a large-scale study capturing nearly 900,000 data points from 835 servers. We examine this data from two perspectives: that of a service provider wishing to offer a consistent environment, and that of a systems researcher who must understand how variability impacts experimental results. From this examination, we draw a number of lessons about the types and magnitudes of performance variability and the effects on confidence in experiment results. We also create a statistical model that can be used to understand how representative an individual server is of the general population. The full dataset and our analysis tools are publicly available, and we have built a system to interactively explore the data and make recommendations for experiment parameters based on statistical analysis of historical data."
+featured: false
+url_slides: https://docs.google.com/presentation/d/1cFQ3jNLC2WsW8eqV74lX89ZMzGkIUuXwQKD1b4w81L8/edit#slide=id.p
+publication: "*13th USENIX Symposium on Operating Systems Design and Implementation (OSDI'18)*"
+tags: ["papers", "performance", "statistics", "cloud", "reproducibility", "systems"]
+projects:
+- practical-reproducibility
+---
+
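The statistical modeling this abstract mentions is nonparametric in spirit; below is a minimal sketch of a percentile-bootstrap confidence interval for the median of repeated benchmark timings (synthetic data; not the paper's dataset or its exact method):

```python
# Percentile-bootstrap CI for the median of benchmark timings.
import random
import statistics

def bootstrap_median_ci(samples, n_boot=10_000, alpha=0.05, seed=0):
    """Resample with replacement; take the alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(samples, k=len(samples)))
        for _ in range(n_boot)
    )
    lo = medians[int(n_boot * alpha / 2)]
    hi = medians[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

rng = random.Random(1)
timings = [rng.gauss(100, 10) for _ in range(50)]  # synthetic runs of one benchmark
print(bootstrap_median_ci(timings))
```

A wide interval here is the signal that more repetitions are needed before comparing two systems.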
diff --git a/content/publication/mowat-netapp-07/cite.bib b/content/publication/mowat-netapp-07/cite.bib
new file mode 100644
index 00000000000..385550b3d05
--- /dev/null
+++ b/content/publication/mowat-netapp-07/cite.bib
@@ -0,0 +1,11 @@
+@misc{mowat:netapp07,
+ author = {J. Eric Mowat and Yee-Peng Wang and Carlos Maltzahn and Raghu C. Mallena},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUTS9tb3dhdC1uZXRhcHAwNy5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SbW93YXQtbmV0YXBwMDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptb3dhdC1uZXRhcHAwNy5wZGYADgAmABIAbQBvAHcAYQB0AC0AbgBlAHQAYQBwAHAAMAA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9NL21vd2F0LW5ldGFwcDA3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:52:04 -0800},
+ keywords = {patents, caching, webcaching },
+ month = {July},
+ title = {United States Patent 7,249,219: Method and Apparatus to Improve Buffer Cache Hit Rate},
+ year = {2007}
+}
+
diff --git a/content/publication/mowat-netapp-07/index.md b/content/publication/mowat-netapp-07/index.md
new file mode 100644
index 00000000000..b864e598e04
--- /dev/null
+++ b/content/publication/mowat-netapp-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "United States Patent 7,249,219: Method and Apparatus to Improve Buffer Cache Hit Rate"
+date: 2007-07-01
+publishDate: 2020-01-05T06:43:50.737123Z
+authors: ["J. Eric Mowat", "Yee-Peng Wang", "Carlos Maltzahn", "Raghu C. Mallena"]
+publication_types: ["0"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["patents", "caching", "webcaching"]
+---
+
diff --git a/content/publication/nsf-repeto-22/cite.bib b/content/publication/nsf-repeto-22/cite.bib
new file mode 100644
index 00000000000..9736e7e53dc
--- /dev/null
+++ b/content/publication/nsf-repeto-22/cite.bib
@@ -0,0 +1,11 @@
+@unpublished{nsf:repeto22,
+ author = {National Science Foundation -- Office of Advanced Cyberinfrastructure (OAC)},
+ date-added = {2022-08-16 17:33:00 -0700},
+ date-modified = {2022-08-16 17:35:36 -0700},
+ keywords = {funding},
+ month = {October},
+ note = {Available at www.nsf.gov/awardsearch/showAward?AWD_ID=2226407},
+ title = {Collaborative Research: Disciplinary Improvements: Repeto: Building a Network for Practical Reproducibility in Experimental Computer Science},
+ year = {2022}
+}
+
diff --git a/content/publication/nsf-repeto-22/index.md b/content/publication/nsf-repeto-22/index.md
new file mode 100644
index 00000000000..56f205cf278
--- /dev/null
+++ b/content/publication/nsf-repeto-22/index.md
@@ -0,0 +1,37 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: 'Collaborative Research: Disciplinary Improvements: Repeto: Building a Network
+ for Practical Reproducibility in Experimental Computer Science'
+subtitle: ''
+summary: ''
+authors:
+- National Science Foundation -- Office of Advanced Cyberinfrastructure (OAC)
+tags:
+- funding
+categories: []
+date: '2022-10-01'
+lastmod: 2022-08-16T18:23:04-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+publishDate: '2022-08-17T01:23:04.448417Z'
+publication_types:
+- '3'
+abstract: ''
+publication: ''
+---
diff --git a/content/publication/pineiro-rtas-11/cite.bib b/content/publication/pineiro-rtas-11/cite.bib
new file mode 100644
index 00000000000..07630b9dc06
--- /dev/null
+++ b/content/publication/pineiro-rtas-11/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{pineiro:rtas11,
+ abstract = {Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well-understood, but the interaction of communicating tasks with different timing characteristics is less well-understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.},
+ address = {Chicago, IL},
+ author = {Roberto Pineiro and Kleoni Ioannidou and Carlos Maltzahn and Scott A. Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUC9waW5laXJvLXJ0YXMxMS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8ScGluZWlyby1ydGFzMTEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVAAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UDpwaW5laXJvLXJ0YXMxMS5wZGYADgAmABIAcABpAG4AZQBpAHIAbwAtAHIAdABhAHMAMQAxAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9QL3BpbmVpcm8tcnRhczExLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {RTAS 2011},
+ date-added = {2010-12-15 12:11:43 -0800},
+ date-modified = {2020-01-05 05:37:41 -0700},
+ keywords = {papers, memory, realtime, qos, performance, management},
+ month = {April 11-14},
+ title = {RAD-FLOWS: Buffering for Predictable Communication},
+ year = {2011}
+}
+
diff --git a/content/publication/pineiro-rtas-11/index.md b/content/publication/pineiro-rtas-11/index.md
new file mode 100644
index 00000000000..62ec88b4d8b
--- /dev/null
+++ b/content/publication/pineiro-rtas-11/index.md
@@ -0,0 +1,12 @@
+---
+title: "RAD-FLOWS: Buffering for Predictable Communication"
+date: 2011-04-01
+publishDate: 2020-01-05T13:33:05.986058Z
+authors: ["Roberto Pineiro", "Kleoni Ioannidou", "Carlos Maltzahn", "Scott A. Brandt"]
+publication_types: ["1"]
+abstract: "Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well-understood, but the interaction of communicating tasks with different timing characteristics is less well-understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole."
+featured: false
+publication: "*RTAS 2011*"
+tags: ["papers", "memory", "realtime", "qos", "performance", "management"]
+---
+
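As a rough illustration of sizing a reserved memory buffer from the timing characteristics of the interacting tasks, here is generic producer/consumer double-buffering arithmetic; the actual RAD-Flows model and its proofs are in the paper, and none of the formulas or numbers below are taken from it:

```python
# Generic worst-case buffer bound for a rate-based producer/consumer pair:
# the buffer must hold everything the producer can write while a full
# producer period plus a full consumer period elapses.
import math

def buffer_bytes(producer_rate_bps, producer_period_s, consumer_period_s):
    """Worst-case bytes buffered before the consumer is guaranteed to drain."""
    window = producer_period_s + consumer_period_s
    return math.ceil(producer_rate_bps * window)

print(buffer_bytes(producer_rate_bps=1_000_000,
                   producer_period_s=0.01,
                   consumer_period_s=0.05))  # 60000 bytes
```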
diff --git a/content/publication/polte-fast-10/cite.bib b/content/publication/polte-fast-10/cite.bib
new file mode 100644
index 00000000000..b5e321ad74c
--- /dev/null
+++ b/content/publication/polte-fast-10/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{polte:fast10,
+ address = {San Jose, CA},
+ author = {Milo Polte and Esteban Molina-Estolano and John Bent and Scott A. Brandt and Garth A. Gibson and Maya Gokhale and Carlos Maltzahn and Meghan Wingate},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASUC9wb2x0ZS1mYXN0MTAucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EHBvbHRlLWZhc3QxMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFQAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlA6cG9sdGUtZmFzdDEwLnBkZgAOACIAEABwAG8AbAB0AGUALQBmAGEAcwB0ADEAMAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvUC9wb2x0ZS1mYXN0MTAucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {Work-in-Progress and Poster Session at FAST'10},
+ date-added = {2010-03-01 16:40:17 -0800},
+ date-modified = {2010-03-01 16:46:40 -0800},
+ month = {February 24-27},
+ title = {Enabling Scientific Application I/O on Cloud FileSystems},
+ year = {2010}
+}
+
diff --git a/content/publication/polte-fast-10/index.md b/content/publication/polte-fast-10/index.md
new file mode 100644
index 00000000000..a2cb136b7d3
--- /dev/null
+++ b/content/publication/polte-fast-10/index.md
@@ -0,0 +1,11 @@
+---
+title: "Enabling Scientific Application I/O on Cloud FileSystems"
+date: 2010-02-01
+publishDate: 2020-01-05T06:43:50.600943Z
+authors: ["Milo Polte", "Esteban Molina-Estolano", "John Bent", "Scott A. Brandt", "Garth A. Gibson", "Maya Gokhale", "Carlos Maltzahn", "Meghan Wingate"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress and Poster Session at FAST'10*"
+---
+
diff --git a/content/publication/polte-pdsw-10-poster/cite.bib b/content/publication/polte-pdsw-10-poster/cite.bib
new file mode 100644
index 00000000000..e6669323e33
--- /dev/null
+++ b/content/publication/polte-pdsw-10-poster/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{polte:pdsw10poster,
+ address = {New Orleans, LA},
+ author = {Milo Polte and Esteban Molina-Estolano and John Bent and Garth Gibson and Carlos Maltzahn and Maya B. Gokhale and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYUC9wb2x0ZS1wZHN3MTBwb3N0ZXIucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FnBvbHRlLXBkc3cxMHBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFQAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlA6cG9sdGUtcGRzdzEwcG9zdGVyLnBkZgAOAC4AFgBwAG8AbAB0AGUALQBwAGQAcwB3ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvUC9wb2x0ZS1wZHN3MTBwb3N0ZXIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAfUC9wb2x0ZS1wZHN3MTBwb3N0ZXItcG9zdGVyLnBkZk8RAZQAAAAAAZQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////x1wb2x0ZS1wZHN3MTBwb3N0ZXItcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUAAAAgBFLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpQOnBvbHRlLXBkc3cxMHBvc3Rlci1wb3N0ZXIucGRmAAAOADwAHQBwAG8AbAB0AGUALQBwAGQAcwB3ADEAMABwAG8AcwB0AGUAcgAtAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAwL015IERyaXZlL1BhcGVycy9QL3BvbHRlLXBkc3cxMHBvc3Rlci1wb3N0ZXIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEYAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB3g==},
+ bdsk-file-3 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAcUC9wb2x0ZS1wZHN3MTBwb3N0ZXItd2lwLnBkZk8RAYgAAAAAAYgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xpwb2x0ZS1wZHN3MTBwb3N0ZXItd2lwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUAAAAgBCLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpQOnBvbHRlLXBkc3cxMHBvc3Rlci13aXAucGRmAA4ANgAaAHAAbwBsAHQAZQAtAHAAZABzAHcAMQAwAHAAbwBzAHQAZQByAC0AdwBpAHAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASAC0vTXkgRHJpdmUvUGFwZXJzL1AvcG9sdGUtcGRzdzEwcG9zdGVyLXdpcC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEMAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==},
+ booktitle = {Poster Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010},
+ date-added = {2019-12-26 20:08:27 -0800},
+ date-modified = {2019-12-29 16:32:38 -0800},
+ keywords = {shortpapers, parallel, filesystems, cloudcomputing},
+ month = {November 15},
+ title = {PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud},
+ year = {2010}
+}
+
diff --git a/content/publication/polte-pdsw-10-poster/index.md b/content/publication/polte-pdsw-10-poster/index.md
new file mode 100644
index 00000000000..e708f0240fb
--- /dev/null
+++ b/content/publication/polte-pdsw-10-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud"
+date: 2010-11-01
+publishDate: 2020-01-05T06:43:50.381180Z
+authors: ["Esteban Molina-Estolan, John Bent Milo Polte", "Garth Gibson", "Carlos Maltzahn", "Maya B. Gokhale", "Scott Brandt"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010*"
+tags: ["shortpapers", "parallel", "filesystems", "cloudcomputing"]
+---
+
diff --git a/content/publication/povzner-eurosys-08/cite.bib b/content/publication/povzner-eurosys-08/cite.bib
new file mode 100644
index 00000000000..f0b4d499fc4
--- /dev/null
+++ b/content/publication/povzner-eurosys-08/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{povzner:eurosys08,
+ abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ address = {Glasgow, Scotland},
+ author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUC9wb3Z6bmVyLWV1cm9zeXMwOC5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VcG92em5lci1ldXJvc3lzMDgucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVAAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UDpwb3Z6bmVyLWV1cm9zeXMwOC5wZGYAAA4ALAAVAHAAbwB2AHoAbgBlAHIALQBlAHUAcgBvAHMAeQBzADAAOAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUC9wb3Z6bmVyLWV1cm9zeXMwOC5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {Eurosys 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:12:26 -0700},
+ keywords = {papers, performance, management, storage, systems, fahrrad, rbed, realtime, qos},
+ month = {March 31 - April 4},
+ title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ year = {2008}
+}
+
diff --git a/content/publication/povzner-eurosys-08/index.md b/content/publication/povzner-eurosys-08/index.md
new file mode 100644
index 00000000000..e1ed19ef583
--- /dev/null
+++ b/content/publication/povzner-eurosys-08/index.md
@@ -0,0 +1,12 @@
+---
+title: "Efficient Guaranteed Disk Request Scheduling with Fahrrad"
+date: 2008-03-01
+publishDate: 2020-01-05T13:33:06.006274Z
+authors: ["Anna Povzner", "Tim Kaldewey", "Scott A. Brandt", "Richard Golding", "Theodore Wong", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness."
+featured: false
+publication: "*Eurosys 2008*"
+tags: ["papers", "performance", "management", "storage", "systems", "fahrrad", "rbed", "realtime", "qos"]
+---
+
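Reserving disk *time* rather than throughput reduces admission control to a utilization sum, which is why near-100% reservation becomes possible. A minimal sketch of that check (illustrative only, not Fahrrad's implementation):

```python
# Utilization-based admission: each stream reserves a fraction of disk time,
# and a new reservation is admissible iff total utilization stays <= 1.0.

def admit(reservations, new_utilization):
    """reservations: utilization fractions already granted to existing streams."""
    return sum(reservations) + new_utilization <= 1.0

granted = [0.4, 0.35]        # existing streams hold 40% and 35% of disk time
print(admit(granted, 0.2))   # True  -- 95% total, admissible
print(admit(granted, 0.3))   # False -- would exceed 100% of disk time
```

Contrast with throughput reservations, where the gap between best- and worst-case response times forces reserving against the worst case and strands most of the device's capacity.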
diff --git a/content/publication/povzner-fast-08-wip/cite.bib b/content/publication/povzner-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..6490114d4ff
--- /dev/null
+++ b/content/publication/povzner-fast-08-wip/cite.bib
@@ -0,0 +1,11 @@
+@inproceedings{povzner:fast08wip,
+ author = {Anna Povzner and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUC9wb3Z6bmVyLWZhc3QwOHdpcC5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VcG92em5lci1mYXN0MDh3aXAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVAAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UDpwb3Z6bmVyLWZhc3QwOHdpcC5wZGYAAA4ALAAVAHAAbwB2AHoAbgBlAHIALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUC9wb3Z6bmVyLWZhc3QwOHdpcC5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:55:18 -0800},
+ keywords = {shortpapers, predictable, performance, storage},
+ title = {Virtualizing Disk Performance with Fahrrad},
+ year = {2008}
+}
+
diff --git a/content/publication/povzner-fast-08-wip/index.md b/content/publication/povzner-fast-08-wip/index.md
new file mode 100644
index 00000000000..0b1222bc727
--- /dev/null
+++ b/content/publication/povzner-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "Virtualizing Disk Performance with Fahrrad"
+date: 2008-01-01
+publishDate: 2020-01-05T06:43:50.740554Z
+authors: ["Anna Povzner", "Scott A. Brandt", "Richard Golding", "Theodore Wong", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)*"
+tags: ["shortpapers", "predictable", "performance", "storage"]
+---
+
diff --git a/content/publication/povzner-osr-08/cite.bib b/content/publication/povzner-osr-08/cite.bib
new file mode 100644
index 00000000000..30130ef09ae
--- /dev/null
+++ b/content/publication/povzner-osr-08/cite.bib
@@ -0,0 +1,16 @@
+@article{povzner:osr08,
+ abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATUC9wb3Z6bmVyLW9zcjA4LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFwb3Z6bmVyLW9zcjA4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpQOnBvdnpuZXItb3NyMDgucGRmAAAOACQAEQBwAG8AdgB6AG4AZQByAC0AbwBzAHIAMAA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9QL3BvdnpuZXItb3NyMDgucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:12:06 -0700},
+ journal = {Operating Systems Review},
+ keywords = {papers, predictable, performance, storage, media, realtime},
+ month = {May},
+ number = {4},
+ pages = {13-25},
+ title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ volume = {42},
+ year = {2008}
+}
+
diff --git a/content/publication/povzner-osr-08/index.md b/content/publication/povzner-osr-08/index.md
new file mode 100644
index 00000000000..9790d286df4
--- /dev/null
+++ b/content/publication/povzner-osr-08/index.md
@@ -0,0 +1,12 @@
+---
+title: "Efficient Guaranteed Disk Request Scheduling with Fahrrad"
+date: 2008-05-01
+publishDate: 2020-01-05T13:33:06.004612Z
+authors: ["Anna Povzner", "Tim Kaldewey", "Scott A. Brandt", "Richard Golding", "Theodore Wong", "Carlos Maltzahn"]
+publication_types: ["2"]
+abstract: "Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness."
+featured: false
+publication: "*Operating Systems Review*"
+tags: ["papers", "predictable", "performance", "storage", "media", "realtime"]
+---
+
diff --git a/content/publication/pye-fast-08-wip/cite.bib b/content/publication/pye-fast-08-wip/cite.bib
new file mode 100644
index 00000000000..4beb3da41cb
--- /dev/null
+++ b/content/publication/pye-fast-08-wip/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{pye:fast08wip,
+ address = {San Jose, CA},
+ author = {Ian Pye and Scott Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATUC9weWUtZmFzdDA4d2lwLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFweWUtZmFzdDA4d2lwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpQOnB5ZS1mYXN0MDh3aXAucGRmAAAOACQAEQBwAHkAZQAtAGYAYQBzAHQAMAA4AHcAaQBwAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9QL3B5ZS1mYXN0MDh3aXAucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:29:20 -0800},
+ date-modified = {2019-12-29 16:30:47 -0800},
+ keywords = {shortpapers, p2p, filesystems, global},
+ month = {February 26-29},
+ title = {Ringer: A Global-Scale Lightweight P2P File Service},
+ year = {2008}
+}
+
diff --git a/content/publication/pye-fast-08-wip/index.md b/content/publication/pye-fast-08-wip/index.md
new file mode 100644
index 00000000000..91d383ddd19
--- /dev/null
+++ b/content/publication/pye-fast-08-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "Ringer: A Global-Scale Lightweight P2P File Service"
+date: 2008-02-01
+publishDate: 2020-01-05T06:43:50.374876Z
+authors: ["Ian Pye", "Scott Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)*"
+tags: ["shortpapers", "p2p", "filesystems", "global"]
+---
+
diff --git a/content/publication/rodriguez-arxiv-21/cite.bib b/content/publication/rodriguez-arxiv-21/cite.bib
new file mode 100644
index 00000000000..2f96fc352ed
--- /dev/null
+++ b/content/publication/rodriguez-arxiv-21/cite.bib
@@ -0,0 +1,12 @@
+@unpublished{rodriguez:arxiv21,
+ author = {Sebastiaan Alvarez Rodriguez and Jayjeet Chakraborty and Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn and Alexandru Uta},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYXJ4aXYyMS5wZGZPEQF6AAAAAAF6AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vcm9kcmlndWV6LWFyeGl2MjEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAA1EtUgAAAgA8LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpRLVI6cm9kcmlndWV6LWFyeGl2MjEucGRmAA4ALAAVAHIAbwBkAHIAaQBnAHUAZQB6AC0AYQByAHgAaQB2ADIAMQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYXJ4aXYyMS5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHX},
+ date-added = {2021-07-23 11:42:12 -0700},
+ date-modified = {2021-07-23 11:45:00 -0700},
+ keywords = {papers, spark, arrow, performance},
+ month = {June 24},
+ note = {arxiv.org/abs/2106.13020 [cs.DC]},
+ title = {Zero-Cost, Arrow-Enabled Data Interface for Apache Spark},
+ year = {2021}
+}
+
diff --git a/content/publication/rodriguez-arxiv-21/index.md b/content/publication/rodriguez-arxiv-21/index.md
new file mode 100644
index 00000000000..2bd099d337d
--- /dev/null
+++ b/content/publication/rodriguez-arxiv-21/index.md
@@ -0,0 +1,15 @@
+---
+title: "Zero-Cost, Arrow-Enabled Data Interface for Apache Spark"
+date: 2021-06-01
+publishDate: 2021-07-23T18:52:38.468504Z
+authors: ["Sebastiaan Alvarez Rodriguez", "Jayjeet Chakraborty", "Aaron Chu", "Ivo Jimenez", "Jeff LeFevre", "Carlos Maltzahn", "Alexandru Uta"]
+publication_types: ["3"]
+abstract: "Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability is urgent. Recently, Apache Arrow was chosen by the community to serve as a format mediator, providing efficient in-memory data representation. Arrow enables efficient data movement between data processing and storage engines, significantly improving interoperability and overall performance. In this work, we design a new zero-cost data interoperability layer between Apache Spark and Arrow-based data sources through the Arrow Dataset API. Our novel data interface helps separate the computation (Spark) and data (Arrow) layers. This enables practitioners to seamlessly use Spark to access data from all Arrow Dataset API-enabled data sources and frameworks. To benefit our community, we open-source our work and show that consuming data through Apache Arrow is zero-cost: our novel data interface is either on-par or more performant than native Spark."
+featured: false
+publication: "arXiv:2106.13020 [cs.DC]"
+tags: ["papers", "spark", "arrow", "performance"]
+projects:
+- programmable-storage
+- declstore
+- skyhookdm
+---
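For readers unfamiliar with the Arrow Dataset API the abstract refers to, this is what a Dataset-backed read looks like from Python (a minimal PyArrow sketch; the path and column names are hypothetical, and the paper's contribution is the Spark/JVM-side integration, which this does not reproduce):

```python
# Read an Arrow-Dataset-API source into an in-memory Arrow table.
# Requires the pyarrow package.
import pyarrow.dataset as ds

dataset = ds.dataset("/data/events", format="parquet")  # any Dataset-API source
table = dataset.to_table(columns=["ts", "value"])       # materialize as Arrow batches
print(table.num_rows)
```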
diff --git a/content/publication/rodriguez-bigdata-21/cite.bib b/content/publication/rodriguez-bigdata-21/cite.bib
new file mode 100644
index 00000000000..231720d2818
--- /dev/null
+++ b/content/publication/rodriguez-bigdata-21/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{rodriguez:bigdata21,
+ abstract = {Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability is urgent. Recently, Apache Arrow was chosen by the community to serve as a format mediator, providing efficient in-memory data representation. Arrow enables efficient data movement between data processing and storage engines, significantly improving interoperability and overall performance. In this work, we design a new zero-cost data interoperability layer between Apache Spark and Arrow-based data sources through the Arrow Dataset API. Our novel data interface helps separate the computation (Spark) and data (Arrow) layers. This enables practitioners to seamlessly use Spark to access data from all Arrow Dataset API-enabled data sources and frameworks. To benefit our community, we open-source our work and show that consuming data through Apache Arrow is zero-cost: our novel data interface is either on-par or more performant than native Spark.},
+ address = {Virtual Event},
+ author = {Sebastiaan Alvarez Rodriguez and Jayjeet Chakraborty and Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn and Alexandru Uta},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYmlnZGF0YTIxLnBkZk8RAYIAAAAAAYIAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xdyb2RyaWd1ZXotYmlnZGF0YTIxLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADUS1SAAACAD4vOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlEtUjpyb2RyaWd1ZXotYmlnZGF0YTIxLnBkZgAOADAAFwByAG8AZAByAGkAZwB1AGUAegAtAGIAaQBnAGQAYQB0AGEAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA8VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUS1SL3JvZHJpZ3Vlei1iaWdkYXRhMjEucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFsAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB4Q==},
+ booktitle = {2021 IEEE International Conference on Big Data (IEEE BigData 2021)},
+ date-added = {2022-04-11 19:33:51 -0700},
+ date-modified = {2022-04-11 19:59:07 -0700},
+ keywords = {papers, spark, arrow, performance, nsf1836650},
+ month = {December 15-18},
+ title = {Zero-Cost, Arrow-Enabled Data Interface for Apache Spark},
+ year = {2021}
+}
+
diff --git a/content/publication/rodriguez-bigdata-21/index.md b/content/publication/rodriguez-bigdata-21/index.md
new file mode 100644
index 00000000000..0bce7425553
--- /dev/null
+++ b/content/publication/rodriguez-bigdata-21/index.md
@@ -0,0 +1,57 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: Zero-Cost, Arrow-Enabled Data Interface for Apache Spark
+subtitle: ''
+summary: ''
+authors:
+- Sebastiaan Alvarez Rodriguez
+- Jayjeet Chakraborty
+- Aaron Chu
+- Ivo Jimenez
+- Jeff LeFevre
+- Carlos Maltzahn
+- Alexandru Uta
+tags:
+- papers
+- spark
+- arrow
+- performance
+- nsf1836650
+categories: []
+date: '2021-12-01'
+lastmod: 2022-04-25T15:07:59-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects: []
+publishDate: '2022-04-25T22:07:59.443656Z'
+publication_types:
+- '1'
+abstract: 'Distributed data processing ecosystems are widespread and their components
+ are highly specialized, such that efficient interoperability is urgent. Recently,
+ Apache Arrow was chosen by the community to serve as a format mediator, providing
+ efficient in-memory data representation. Arrow enables efficient data movement between
+ data processing and storage engines, significantly improving interoperability and
+ overall performance. In this work, we design a new zero-cost data interoperability
+ layer between Apache Spark and Arrow-based data sources through the Arrow Dataset
+ API. Our novel data interface helps separate the computation (Spark) and data (Arrow)
+ layers. This enables practitioners to seamlessly use Spark to access data from all
+ Arrow Dataset API-enabled data sources and frameworks. To benefit our community,
+ we open-source our work and show that consuming data through Apache Arrow is zero-cost:
+ our novel data interface is either on-par or more performant than native Spark.'
+publication: '*2021 IEEE International Conference on Big Data (IEEE BigData 2021)*'
+---
diff --git a/content/publication/rose-caise-92/cite.bib b/content/publication/rose-caise-92/cite.bib
new file mode 100644
index 00000000000..4dbe27eb5cd
--- /dev/null
+++ b/content/publication/rose-caise-92/cite.bib
@@ -0,0 +1,19 @@
+@inproceedings{rose:caise92,
+ abstract = {Repositories provide the information system's support to layer software environments. Initially, repository technology has been dominated by object representation issues. Teams are not part of the ball game. In this paper, we propose the concept of sharing processes which supports distribution and sharing of objects and tasks by teams. Sharing processes are formally specified as classes of non-deterministic finite automata connected to each other by deduction rules. They are intended to coordinate object access and communication for task distribution in large development projects. In particular, we show how interactions between both sharings improve object management.},
+ address = {Manchester, UK},
+ author = {Thomas Rose and Carlos Maltzahn and Matthias Jarke},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUS1SL3Jvc2UtY2Fpc2U5Mi5wZGZPEQFmAAAAAAFmAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Qcm9zZS1jYWlzZTkyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA1EtUgAAAgA6LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpRLVI6cm9zZS1jYWlzZTkyLnBkZgAOACIAEAByAG8AcwBlAC0AYwBhAGkAcwBlADkAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAJS9NeSBEcml2ZS9QYXBlcnMvUS1SL3Jvc2UtY2Fpc2U5Mi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADsAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABpQ==},
+ booktitle = {Advanced Information Systems Engineering (CAiSE'92)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:14:53 -0700},
+ editor = {Pericles Loucopoulos},
+ keywords = {papers, sharing, cscw, datamanagement},
+ month = {May 12--15},
+ pages = {17--32},
+ publisher = {Springer Berlin / Heidelberg},
+ series = {Lecture Notes in Computer Science},
+ title = {Integrating object and agent worlds},
+ volume = {593},
+ year = {1992}
+}
+
diff --git a/content/publication/rose-caise-92/index.md b/content/publication/rose-caise-92/index.md
new file mode 100644
index 00000000000..b8bc372889c
--- /dev/null
+++ b/content/publication/rose-caise-92/index.md
@@ -0,0 +1,12 @@
+---
+title: "Integrating object and agent worlds"
+date: 1992-05-01
+publishDate: 2020-01-05T13:33:06.008206Z
+authors: ["Thomas Rose", "Carlos Maltzahn", "Matthias Jarke"]
+publication_types: ["1"]
+abstract: "Repositories provide the information system's support to layer software environments. Initially, repository technology has been dominated by object representation issues. Teams are not part of the ball game. In this paper, we propose the concept of sharing processes which supports distribution and sharing of objects and tasks by teams. Sharing processes are formally specified as classes of non-deterministic f'mite automata connected to each other by deduction rules. They are intended to coordinate object access and communication for task distribution in large development projects. In particular, we show how interactions between both sharings improve object management."
+featured: false
+publication: "*Advanced Information Systems Engineering (CAiSE'92)*"
+tags: ["papers", "sharing", "cscw", "datamanagement"]
+---
+
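A toy rendering of a "sharing process" as a state machine, to make the automaton idea concrete; the states and transitions below are invented for illustration, whereas the paper's automata are nondeterministic and linked to each other by deduction rules:

```python
# Deterministic toy automaton for coordinating access to a shared object.
TRANSITIONS = {
    ("available", "check_out"): "checked_out",
    ("checked_out", "check_in"): "available",
    ("available", "publish"): "shared",
    ("shared", "check_out"): "checked_out",
}

def step(state, event):
    """Advance the sharing automaton; reject undefined transitions."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")

state = "available"
for event in ("check_out", "check_in", "publish"):
    state = step(state, event)
print(state)  # shared
```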
diff --git a/content/publication/rose-sej-91/cite.bib b/content/publication/rose-sej-91/cite.bib
new file mode 100644
index 00000000000..9f2c09bd4d8
--- /dev/null
+++ b/content/publication/rose-sej-91/cite.bib
@@ -0,0 +1,16 @@
+@article{rose:sej91,
+ abstract = {In the context of the ESPRIT project DAIDA, we have developed an experimental environment intended to achieve consistency-in-the-large in a multi-person setting. Our conceptual model of configuration processes, the CAD$^\circ$ model, centres around decisions that work on configured objects and are subject to structured conversations. The environment, extending the knowledge-based software information system ConceptBase, supports co-operation within development teams by integrating models and tools for argumentation and co-ordination with those for versioning and configuration. Versioning decisions are discussed and decided on within an argument editor, and executed by specialised tools for programming-in-the-small. Tasks are assigned and monitored through a contract tool, and carried out within co-ordinated workspaces under a conflict-tolerant transaction protocol. Consistent configuration and reconfiguration of local results is supported by a logic-based configuration assistant.},
+ author = {Thomas Rose and Matthias Jarke and Martin Gocek and Carlos Maltzahn and Hans Nissen},
+ bdsk-url-1 = {http://dx.doi.org/10.1049/sej.1991.0034},
+ date-added = {2014-06-27 02:43:48 +0000},
+ date-modified = {2020-01-04 21:56:48 -0700},
+ journal = {Software Engineering Journal},
+ keywords = {papers, software, programming, collaborative},
+ month = {September},
+ number = {5},
+ pages = {332--346},
+ title = {A Decision-Based Configuration Process Environment},
+ volume = {6},
+ year = {1991}
+}
+
diff --git a/content/publication/rose-sej-91/index.md b/content/publication/rose-sej-91/index.md
new file mode 100644
index 00000000000..67e3d49c80c
--- /dev/null
+++ b/content/publication/rose-sej-91/index.md
@@ -0,0 +1,12 @@
+---
+title: "A Decision-Based Configuration Process Environment"
+date: 1991-09-01
+publishDate: 2020-01-05T06:43:50.493937Z
+authors: ["Thomas Rose", "Matthias Jarke", "Martin Gocek", "Carlos Maltzahn", "Hans Nissen"]
+publication_types: ["2"]
+abstract: "In the context of the ESPRIT project DAIDA, we have developed an experimental environment intended to achieve consistency-in-the-large in a multi-person setting. Our conceptual model of configuration processes, the CAD$,^∘$ model, centres around decisions that work on configured objects and are subject to structured conversations. The environment, extending the knowledge-based software information system ConceptBase, supports co-operation within development teams by integrating models and tools for argumentation and co-ordination with those for versioning and configuration. Versioning decisions are discussed and decided on within an argument editor, and executed by specialised tools for programming-in-the-small. Tasks are assigned and monitored through a contract tool, and carried out within co-ordinated workspaces under a conflict-tolerant transaction protocol. Consistent configuration and reconfiguration of local results is supported by a logic-based configuration assistant."
+featured: false
+publication: "*Software Engineering Journal*"
+tags: ["papers", "software", "programming", "collaborative"]
+---
+
diff --git a/content/publication/sevilla-ccgrid-18/cite.bib b/content/publication/sevilla-ccgrid-18/cite.bib
new file mode 100644
index 00000000000..78a717b7b09
--- /dev/null
+++ b/content/publication/sevilla-ccgrid-18/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:ccgrid18,
+ abstract = {Our analysis of the key-value activity generated by the ParSplice molecular dynamics simulation demonstrates the need for more complex cache management strategies. Baseline measurements show clear key access patterns and hot spots that offer significant opportunity for optimization. We use the data management language and policy engine from the Mantle system to dynamically explore a variety of techniques, ranging from basic algorithms and heuristics to statistical models, calculus, and machine learning. While Mantle was originally designed for distributed file systems, we show how the collection of abstractions effectively decomposes the problem into manageable policies for a different application and storage system. Our exploration of this space results in a dynamically sized cache policy that does not sacrifice any performance while using 32-66% less memory than the default ParSplice configuration.},
+ address = {Washington, DC},
+ author = {Michael A. Sevilla and Carlos Maltzahn and Peter Alvaro and Reza Nasirigerdeh and Bradley W. Settlemyer and Danny Perez and David Rich and Galen M. Shipman},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWUy9zZXZpbGxhLWNjZ3JpZDE4LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRzZXZpbGxhLWNjZ3JpZDE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAFBERiBDQVJPAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpTOnNldmlsbGEtY2NncmlkMTgucGRmAA4AKgAUAHMAZQB2AGkAbABsAGEALQBjAGMAZwByAGkAZAAxADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1Mvc2V2aWxsYS1jY2dyaWQxOC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ booktitle = {CCGRID '18},
+ date-added = {2018-07-01 21:56:37 +0000},
+ date-modified = {2020-01-04 21:31:45 -0700},
+ keywords = {papers, caching, programmable, storage, hpc},
+ month = {May 1-4},
+ title = {Programmable Caches with a Data Management Language & Policy Engine},
+ year = {2018}
+}
+
diff --git a/content/publication/sevilla-ccgrid-18/index.md b/content/publication/sevilla-ccgrid-18/index.md
new file mode 100644
index 00000000000..fe2a0951152
--- /dev/null
+++ b/content/publication/sevilla-ccgrid-18/index.md
@@ -0,0 +1,14 @@
+---
+title: "Programmable Caches with a Data Management Language & Policy Engine"
+date: 2018-05-01
+publishDate: 2020-01-05T06:43:50.429426Z
+authors: ["Michael A. Sevilla", "Carlos Maltzahn", "Peter Alvaro", "Reza Nasirigerdeh", "Bradley W. Settlemyer", "Danny Perez", "David Rich", "Galen M. Shipman"]
+publication_types: ["1"]
+abstract: "Our analysis of the key-value activity generated by the ParSplice molecular dynamics simulation demonstrates the need for more complex cache management strategies. Baseline measurements show clear key access patterns and hot spots that offer significant opportunity for optimization. We use the data management language and policy engine from the Mantle system to dynamically explore a variety of techniques, ranging from basic algorithms and heuristics to statistical models, calculus, and machine learning. While Mantle was originally designed for distributed file systems, we show how the collection of abstractions effectively decomposes the problem into manageable policies for a different application and storage system. Our exploration of this space results in a dynamically sized cache policy that does not sacrifice any performance while using 32-66% less memory than the default ParSplice configuration."
+featured: false
+publication: "*CCGRID '18*"
+tags: ["papers", "caching", "programmable", "storage", "hpc"]
+projects:
+- programmable-storage
+---
+
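+The dynamically sized cache policy described above is easiest to picture as an ordinary LRU cache whose capacity is periodically re-tuned from its observed hit rate. The sketch below is illustrative only; the class name, thresholds, and resize rule are assumptions, not taken from the Mantle or ParSplice code:
+
+```python
+from collections import OrderedDict
+
+class AdaptiveLRUCache:
+    """Toy LRU cache that re-tunes its capacity from the hit rate
+    observed over a sliding window of lookups (hypothetical policy)."""
+
+    def __init__(self, capacity=1024, window=1000):
+        self.capacity, self.window = capacity, window
+        self.data = OrderedDict()
+        self.hits = self.lookups = 0
+
+    def get(self, key):
+        self.lookups += 1
+        if key in self.data:
+            self.hits += 1
+            self.data.move_to_end(key)        # mark most recently used
+        if self.lookups >= self.window:
+            self._resize()
+        return self.data.get(key)
+
+    def put(self, key, value):
+        self.data[key] = value
+        self.data.move_to_end(key)
+        self._evict()
+
+    def _resize(self):
+        # Shrink while hits are plentiful, grow when they are scarce.
+        factor = 0.9 if self.hits / self.lookups > 0.95 else 1.1
+        self.capacity = max(64, int(self.capacity * factor))
+        self.hits = self.lookups = 0
+        self._evict()
+
+    def _evict(self):
+        while len(self.data) > self.capacity:
+            self.data.popitem(last=False)     # drop least recently used
+```
+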
diff --git a/content/publication/sevilla-discs-13/cite.bib b/content/publication/sevilla-discs-13/cite.bib
new file mode 100644
index 00000000000..dc6717c6a81
--- /dev/null
+++ b/content/publication/sevilla-discs-13/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:discs13,
+ abstract = {When data grows too large, we scale to larger systems, either by scaling out or up. It is understood that scale-out and scale-up have different complexities and bottlenecks but a thorough comparison of the two architectures is challenging because of the diversity of their programming interfaces, their significantly different system environments, and their sensitivity to workload specifics. In this paper, we propose a novel comparison framework based on MapReduce that accounts for the application, its requirements, and its input size by considering input, software, and hardware parameters. Part of this framework requires implementing scale-out properties on scale-up and we discuss the complex trade-offs, interactions, and dependencies of these properties for two specific case studies (word count and sort). This work lays the foundation for future work in quantifying design decisions and in building a system that automatically compares architectures and selects the best one.},
+ address = {Denver, CO},
+ author = {Michael Sevilla and Ike Nassi and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVUy9zZXZpbGxhLWRpc2NzMTMucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E3NldmlsbGEtZGlzY3MxMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2V2aWxsYS1kaXNjczEzLnBkZgAADgAoABMAcwBlAHYAaQBsAGwAYQAtAGQAaQBzAGMAcwAxADMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL1Mvc2V2aWxsYS1kaXNjczEzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {DISCS 2013 at SC13},
+ date-added = {2014-07-11 20:53:58 +0000},
+ date-modified = {2020-01-04 21:55:43 -0700},
+ keywords = {papers, scalable, systems, distributed, sharedmemory},
+ month = {November 18},
+ title = {A Framework for an In-depth Comparison of Scale-up and Scale-out},
+ year = {2013}
+}
+
diff --git a/content/publication/sevilla-discs-13/index.md b/content/publication/sevilla-discs-13/index.md
new file mode 100644
index 00000000000..ed97fe59910
--- /dev/null
+++ b/content/publication/sevilla-discs-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "A Framework for an In-depth Comparison of Scale-up and Scale-out"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.489544Z
+authors: ["Micheal Sevilla", "Ike Nassi", "Kleoni Ioannidou", "Scott Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "When data grows too large, we scale to larger systems, either by scaling out or up. It is understood that scale-out and scale-up have different complexities and bottlenecks but a thorough comparison of the two architectures is challenging because of the diversity of their programming interfaces, their significantly different system environments, and their sensitivity to workload specifics. In this paper, we propose a novel comparison framework based on MapReduce that accounts for the application, its requirements, and its input size by considering input, software, and hardware parameters. Part of this framework requires implementing scale-out properties on scale-up and we discuss the complex trade-offs, interactions, and dependencies of these properties for two specific case studies (word count and sort). This work lays the foundation for future work in quantifying design decisions and in building a system that automatically compares architectures and selects the best one."
+featured: false
+publication: "*DISCS 2013 at SC13*"
+tags: ["papers", "scalable", "systems", "distributed", "sharedmemory"]
+---
+
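+Because the framework uses MapReduce as the common programming interface, the same two-phase program can be measured on both architectures; only the placement of chunks differs. The word-count case study mentioned above reduces to something like this sketch (illustrative, not the paper's harness):
+
+```python
+from collections import defaultdict
+
+def map_phase(chunk):
+    # Emit (word, 1) pairs; on scale-out the chunks live on different
+    # nodes, on scale-up they are processed by threads sharing one
+    # large memory.
+    return [(word, 1) for word in chunk.split()]
+
+def reduce_phase(pairs):
+    counts = defaultdict(int)
+    for word, n in pairs:
+        counts[word] += n
+    return dict(counts)
+
+chunks = ["a rose is a rose", "is a rose"]
+pairs = [p for chunk in chunks for p in map_phase(chunk)]
+print(reduce_phase(pairs))   # {'a': 3, 'rose': 3, 'is': 2}
+```
+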
diff --git a/content/publication/sevilla-eurosys-17/cite.bib b/content/publication/sevilla-eurosys-17/cite.bib
new file mode 100644
index 00000000000..5852a6a4c72
--- /dev/null
+++ b/content/publication/sevilla-eurosys-17/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{sevilla:eurosys17,
+ abstract = {Storage systems need to support high-performance for special-purpose data processing applications that run on an evolving storage device technology landscape. This puts tremendous pressure on storage systems to support rapid change both in terms of their interfaces and their performance. But adapting storage systems can be difficult because unprincipled changes might jeopardize years of code-hardening and performance optimization efforts that were necessary for users to entrust their data to the storage system. We introduce the programmable storage approach, which exposes internal services and abstractions of the storage stack as building blocks for higher-level services. We also build a prototype to explore how existing abstractions of common storage system services can be leveraged to adapt to the needs of new data processing systems and the increasing variety of storage devices. We illustrate the advantages and challenges of this approach by composing existing internal abstractions into two new higher-level services: a file system metadata load balancer and a high-performance distributed shared-log. The evaluation demonstrates that our services inherit desirable qualities of the back-end storage system, including the ability to balance load, efficiently propagate service metadata, recover from failure, and navigate trade-offs between latency and throughput using leases.},
+ address = {Belgrade, Serbia},
+ author = {Michael A. Sevilla and Noah Watkins and Ivo Jimenez and Peter Alvaro and Shel Finkelstein and Jeff LeFevre and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUy9zZXZpbGxhLWV1cm9zeXMxNy5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vc2V2aWxsYS1ldXJvc3lzMTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UzpzZXZpbGxhLWV1cm9zeXMxNy5wZGYAAA4ALAAVAHMAZQB2AGkAbABsAGEALQBlAHUAcgBvAHMAeQBzADEANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUy9zZXZpbGxhLWV1cm9zeXMxNy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAeUy9zZXZpbGxhLWV1cm9zeXMxNy1zbGlkZXMucGRmTxEBkAAAAAABkAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////HHNldmlsbGEtZXVyb3N5czE3LXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACAEQvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2V2aWxsYS1ldXJvc3lzMTctc2xpZGVzLnBkZgAOADoAHABzAGUAdgBpAGwAbABhAC0AZQB1AHIAbwBzAHkAcwAxADcALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALy9NeSBEcml2ZS9QYXBlcnMvUy9zZXZpbGxhLWV1cm9zeXMxNy1zbGlkZXMucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABFAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ booktitle = {EuroSys '17},
+ date-added = {2017-03-14 22:06:29 +0000},
+ date-modified = {2020-01-04 21:42:47 -0700},
+ keywords = {papers, storage, systems, programmable, abstraction},
+ month = {April 23-26},
+ title = {Malacology: A Programmable Storage System},
+ year = {2017}
+}
+
diff --git a/content/publication/sevilla-eurosys-17/index.md b/content/publication/sevilla-eurosys-17/index.md
new file mode 100644
index 00000000000..006e162996c
--- /dev/null
+++ b/content/publication/sevilla-eurosys-17/index.md
@@ -0,0 +1,14 @@
+---
+title: "Malacology: A Programmable Storage System"
+date: 2017-04-01
+publishDate: 2020-01-05T06:43:50.448168Z
+authors: ["Michael A. Sevilla", "Noah Watkins", "Ivo Jimenez", "Peter Alvaro", "Shel Finkelstein", "Jeff LeFevre", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Storage systems need to support high-performance for special-purpose data processing applications that run on an evolving storage device technology landscape. This puts tremendous pressure on storage systems to support rapid change both in terms of their interfaces and their performance. But adapting storage systems can be difficult because unprincipled changes might jeopardize years of code-hardening and performance optimization efforts that were necessary for users to entrust their data to the storage system. We introduce the programmable storage approach, which exposes internal services and abstractions of the storage stack as building blocks for higher-level services. We also build a prototype to explore how existing abstractions of common storage system services can be leveraged to adapt to the needs of new data processing systems and the increasing variety of storage devices. We illustrate the advantages and challenges of this approach by composing existing internal abstractions into two new higher-level services: a file system metadata load balancer and a high-performance distributed shared-log. The evaluation demonstrates that our services inherit desirable qualities of the back-end storage system, including the ability to balance load, efficiently propagate service metadata, recover from failure, and navigate trade-offs between latency and throughput using leases."
+featured: false
+publication: "*EuroSys '17*"
+tags: ["papers", "storage", "systems", "programmable", "abstraction"]
+projects:
+- programmable-storage
+---
+
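+The core idea, composing internal services of the storage stack into new higher-level services, can be pictured with a toy shared log built from two stand-in building blocks. All names here are hypothetical; they are not Malacology's actual interfaces:
+
+```python
+class ObjectStore:
+    """Stand-in for an existing storage abstraction (object I/O)."""
+    def __init__(self):
+        self.objects = {}
+    def write(self, name, data):
+        self.objects[name] = data
+    def read(self, name):
+        return self.objects.get(name)
+
+class Sequencer:
+    """Stand-in for a reusable internal service issuing log positions."""
+    def __init__(self):
+        self.position = 0
+    def next(self):
+        self.position += 1
+        return self.position
+
+class SharedLog:
+    """Higher-level service composed from the two building blocks."""
+    def __init__(self, store, sequencer):
+        self.store, self.sequencer = store, sequencer
+    def append(self, entry):
+        pos = self.sequencer.next()
+        self.store.write(f"log.{pos}", entry)
+        return pos
+    def read(self, pos):
+        return self.store.read(f"log.{pos}")
+
+log = SharedLog(ObjectStore(), Sequencer())
+log.append("hello")
+print(log.read(1))   # hello
+```
+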
diff --git a/content/publication/sevilla-fast-14-wip/cite.bib b/content/publication/sevilla-fast-14-wip/cite.bib
new file mode 100644
index 00000000000..1b26da9eb40
--- /dev/null
+++ b/content/publication/sevilla-fast-14-wip/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{sevilla:fast14wip,
+ address = {San Jose, CA},
+ author = {Michael Sevilla and Scott Brandt and Carlos Maltzahn and Ike Nassi and Sam Fineberg},
+ booktitle = {Work-in-Progress and Poster Session at the 12th USENIX Conference on File and Storage Technology (FAST 2014)},
+ date-added = {2019-12-26 19:20:27 -0800},
+ date-modified = {2019-12-29 16:35:02 -0800},
+ keywords = {shortpapers, filesystems, metadata, loadbalancing},
+ month = {February 17-20},
+ title = {Exploring Resource Migration using the CephFS Metadata cluster},
+ year = {2014}
+}
+
diff --git a/content/publication/sevilla-fast-14-wip/index.md b/content/publication/sevilla-fast-14-wip/index.md
new file mode 100644
index 00000000000..544c86cf376
--- /dev/null
+++ b/content/publication/sevilla-fast-14-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "Exploring Resource Migration using the CephFS Metadata cluster"
+date: 2014-02-01
+publishDate: 2020-01-05T06:43:50.386081Z
+authors: ["Michael Sevilla", "Scott Brandt", "Carlos Maltzahn", "Ike Nassi", "Sam Fineberg"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress and Poster Session at the 12th USENIX Conference on File and Storage Technology (FAST 2014)*"
+tags: ["shortpapers", "filesystems", "metadata", "loadbalancing"]
+---
+
diff --git a/content/publication/sevilla-hotstorage-18/cite.bib b/content/publication/sevilla-hotstorage-18/cite.bib
new file mode 100644
index 00000000000..c52cd843430
--- /dev/null
+++ b/content/publication/sevilla-hotstorage-18/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:hotstorage18,
+ address = {Boston, MA},
+ annote = {Submitted to HotStorage'18},
+ author = {Michael A. Sevilla and Reza Nasirigerdeh and Carlos Maltzahn and Jeff LeFevre and Noah Watkins and Peter Alvaro and Margaret Lawson and Jay Lofstead and Jim Pivarski},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAaUy9zZXZpbGxhLWhvdHN0b3JhZ2UxOC5wZGZPEQGAAAAAAAGAAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Yc2V2aWxsYS1ob3RzdG9yYWdlMTgucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAABQREYgQ0FSTwABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAQC86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UzpzZXZpbGxhLWhvdHN0b3JhZ2UxOC5wZGYADgAyABgAcwBlAHYAaQBsAGwAYQAtAGgAbwB0AHMAdABvAHIAYQBnAGUAMQA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgArL015IERyaXZlL1BhcGVycy9TL3NldmlsbGEtaG90c3RvcmFnZTE4LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAQQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF},
+ booktitle = {HotStorage '18},
+ date-added = {2018-09-04 00:39:56 -0700},
+ date-modified = {2018-09-04 00:41:15 -0700},
+ keywords = {papers, metadata, filesystems, scalable, naming},
+ month = {July 9-10},
+ title = {Tintenfisch: File System Namespace Schemas and Generators},
+ year = {2018}
+}
+
diff --git a/content/publication/sevilla-hotstorage-18/index.md b/content/publication/sevilla-hotstorage-18/index.md
new file mode 100644
index 00000000000..13f312fd12b
--- /dev/null
+++ b/content/publication/sevilla-hotstorage-18/index.md
@@ -0,0 +1,16 @@
+---
+title: "Tintenfisch: File System Namespace Schemas and Generators"
+date: 2018-07-01
+publishDate: 2020-01-05T06:43:50.427099Z
+authors: ["Michael A. Sevilla", "Reza Nasirigerdeh", "Carlos Maltzahn", "Jeff LeFevre", "Noah Watkins", "Peter Alvaro", "Margaret Lawson", "Jay Lofstead", "Jim Pivarski"]
+publication_types: ["1"]
+abstract: "The file system metadata service is the scalability bottleneck for many of today’s workloads. Common approaches for attacking this “metadata scaling wall” include: caching inodes on clients and servers, caching parent inodes for path traversal, and dynamic caching policies that exploit workload locality. These caches reduce the number of remote procedure calls (RPCs) but the effectiveness is dependent on the overhead of maintaining cache coherence and the administrator’s ability to select the best cache size for the given workloads. Recent work reduces the number of metadata RPCs to 1 without using a cache at all, by letting clients “decouple” the subtrees from the global namespace so that they can do metadata operations locally. Even with this technique, we show that file system metadata is still a bottleneck because namespaces for today’s workloads can be very large. The size is problematic for reads because metadata needs to be transferred and materialized.
+
+The management techniques for file system metadata assume that namespaces have no structure but we observe that this is not the case for all workloads. We propose Tintenfisch, a file system that allows users to succinctly express the structure of the metadata they intend to create. If a user can express the structure of the namespace, Tintenfisch clients and servers can (1) compact metadata, (2) modify large namespaces more quickly, and (3) generate only relevant parts of the namespace. This reduces network traffic, storage footprints, and the number of overall metadata operations needed to complete a job."
+featured: false
+publication: "*HotStorage '18*"
+tags: ["papers", "metadata", "filesystems", "scalable", "naming"]
+projects:
+- programmable-storage
+---
+
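+If the structure of a namespace can be expressed as a schema, clients and servers can generate just the paths they need instead of materializing and transferring the full metadata. A minimal sketch of such a generator follows; the schema format is an assumption for illustration, not Tintenfisch's:
+
+```python
+def namespace(schema, prefix=""):
+    """Lazily generate paths from a compact schema instead of
+    materializing the whole namespace up front."""
+    for name, children in schema.items():
+        path = f"{prefix}/{name}"
+        yield path
+        if isinstance(children, dict):    # nested directories
+            yield from namespace(children, path)
+        else:                             # pattern: n numbered files
+            for i in range(children):
+                yield f"{path}/file{i:04d}"
+
+# A checkpoint-like namespace: two ranks, each writing three files.
+for p in namespace({"ckpt": {"rank0": 3, "rank1": 3}}):
+    print(p)
+```
+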
diff --git a/content/publication/sevilla-ipdps-18/cite.bib b/content/publication/sevilla-ipdps-18/cite.bib
new file mode 100644
index 00000000000..923fb8e0128
--- /dev/null
+++ b/content/publication/sevilla-ipdps-18/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:ipdps18,
+ abstract = {HPC and data center scale application developers are abandoning POSIX IO because file system metadata synchronization and serialization overheads of providing strong consistency and durability are too costly -- and often unnecessary -- for their applications. Unfortunately, designing file systems with weaker consistency or durability semantics excludes applications that rely on stronger guarantees, forcing developers to re-write their applications or deploy them on a different system. We present a framework and API that lets administrators specify their consistency/durability requirements and dynamically assign them to subtrees in the same namespace, allowing administrators to optimize subtrees over time and space for different workloads. We show similar speedups to related work but more importantly, we show performance improvements when we custom fit subtree semantics to applications such as checkpoint-restart (91.7x speedup), user home directories (0.03 standard deviation from optimal), and users checking for partial results (2% overhead).},
+ address = {Vancouver, BC, Canada},
+ author = {Michael A. Sevilla and Ivo Jimenez and Noah Watkins and Jeff LeFevre and Peter Alvaro and Shel Finkelstein and Patrick Donnelly and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVUy9zZXZpbGxhLWlwZHBzMTgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E3NldmlsbGEtaXBkcHMxOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAUERGIENBUk8AAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2V2aWxsYS1pcGRwczE4LnBkZgAADgAoABMAcwBlAHYAaQBsAGwAYQAtAGkAcABkAHAAcwAxADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL1Mvc2V2aWxsYS1pcGRwczE4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {IPDPS 2018},
+ date-added = {2018-03-19 21:24:16 +0000},
+ date-modified = {2020-01-04 22:56:45 -0700},
+ keywords = {papers, metadata, datamanagement, programmable, filesystems, storage, systems},
+ month = {May 21-25},
+ title = {Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace},
+ year = {2018}
+}
+
diff --git a/content/publication/sevilla-ipdps-18/index.md b/content/publication/sevilla-ipdps-18/index.md
new file mode 100644
index 00000000000..4ccdacd1279
--- /dev/null
+++ b/content/publication/sevilla-ipdps-18/index.md
@@ -0,0 +1,22 @@
+---
+title: "Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace"
+date: 2018-05-01
+publishDate: 2020-01-05T06:43:50.439396Z
+authors: ["Michael A. Sevilla", "Ivo Jimenez", "Noah Watkins", "Jeff LeFevre", "Peter Alvaro", "Shel Finkelstein", "Patrick Donnelly", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "HPC and data center scale application developers are abandoning POSIX IO because file system metadata synchronization and serialization overheads of providing strong consistency and durability are too costly -- and often unnecessary -- for their applications. Unfortunately, designing file systems with weaker consistency or durability semantics excludes applications that rely on stronger guarantees, forcing developers to re-write their applications or deploy them on a different system. We present a framework and API that lets administrators specify their consistency/durability requirements and dynamically assign them to subtrees in the same namespace, allowing administrators to optimize subtrees over time and space for different workloads. We show similar speedups to related work but more importantly, we show performance improvements when we custom fit subtree semantics to applications such as checkpoint-restart (91.7x speedup), user home directories (0.03 standard deviation from optimal), and users checking for partial results (2% overhead)."
+featured: false
+publication: "*IPDPS 2018*"
+tags: ["papers", "metadata", "datamanagement", "programmable", "filesystems", "storage", "systems"]
+projects:
+- programmable-storage
+links:
+ - name: "Tech report with artifact links"
+ url: https://www-test.soe.ucsc.edu/sites/default/files/technical-reports/UCSC-SOE-18-01.pdf
+---
+
+{{< callout note >}}
+
+The links to reproducibility artifacts were inadvertently removed from the final version of the paper. However, the associated tech report ([UCSC-SOE-18-01](https://www-test.soe.ucsc.edu/sites/default/files/technical-reports/UCSC-SOE-18-01.pdf)), which is based on an earlier version of the paper, contains all of the links.
+
+{{< /callout >}}
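+
+The shape of the API, assigning consistency/durability semantics to subtrees of one global namespace, can be sketched as a prefix-matched policy registry. This is a hypothetical illustration, not Cudele's actual interface:
+
+```python
+POLICIES = {}
+
+def set_policy(subtree, consistency, durability):
+    """Administrator call: attach semantics to a namespace subtree."""
+    POLICIES[subtree] = (consistency, durability)
+
+def policy_for(path):
+    """Longest matching subtree prefix wins; default to the strongest
+    guarantees, so unconfigured paths behave like plain POSIX."""
+    best = max((s for s in POLICIES if path.startswith(s)),
+               key=len, default=None)
+    return POLICIES.get(best, ("strong", "global"))
+
+set_policy("/home", "strong", "global")    # interactive users
+set_policy("/ckpt", "none", "local")       # checkpoint-restart jobs
+print(policy_for("/ckpt/job42/rank0"))     # ('none', 'local')
+print(policy_for("/scratch/tmp"))          # ('strong', 'global')
+```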
diff --git a/content/publication/sevilla-lspp-14/cite.bib b/content/publication/sevilla-lspp-14/cite.bib
new file mode 100644
index 00000000000..9cc88f4e550
--- /dev/null
+++ b/content/publication/sevilla-lspp-14/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:lspp14,
+ abstract = {Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, today's data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increases CPU utilization (50 - 100%) and job phase speedups (1.16x - 3.13x) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel.},
+ address = {Phoenix, AZ},
+ author = {Michael Sevilla and Ike Nassi and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUy9zZXZpbGxhLWxzcHAxNC5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Sc2V2aWxsYS1sc3BwMTQucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UzpzZXZpbGxhLWxzcHAxNC5wZGYADgAmABIAcwBlAHYAaQBsAGwAYQAtAGwAcwBwAHAAMQA0AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9TL3NldmlsbGEtbHNwcDE0LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {LSPP at IPDPS 2014},
+ date-added = {2014-07-11 20:56:28 +0000},
+ date-modified = {2020-01-04 21:54:00 -0700},
+ keywords = {papers, mapreduce, sharedmemory, performance},
+ month = {May 23},
+ title = {SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce},
+ year = {2014}
+}
+
diff --git a/content/publication/sevilla-lspp-14/index.md b/content/publication/sevilla-lspp-14/index.md
new file mode 100644
index 00000000000..56be7c86810
--- /dev/null
+++ b/content/publication/sevilla-lspp-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce"
+date: 2014-05-01
+publishDate: 2020-01-05T06:43:50.485521Z
+authors: ["Michael Sevilla", "Ike Nassi", "Kleoni Ioannidou", "Scott Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, today's data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increases CPU utilization (50 - 100%) and job phase speedups (1.16x - 3.13x) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel."
+featured: false
+publication: "*LSPP at IPDPS 2014*"
+tags: ["papers", "mapreduce", "sharedmemory", "performance"]
+---
+
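+The ingest chunk pipeline overlaps reading input with map work, which is where the reported CPU-utilization and phase speedups come from. A stripped-down producer/consumer sketch of the idea (illustrative, not SupMR's implementation):
+
+```python
+import queue
+import threading
+
+def ingest(chunks, q):
+    """Producer: read input in chunks so mapping can start before
+    the whole input has been ingested."""
+    for chunk in chunks:
+        q.put(chunk)
+    q.put(None)                      # sentinel: ingest finished
+
+def mapper(q, results):
+    """Consumer: map chunks while later ones are still being read."""
+    while (chunk := q.get()) is not None:
+        results.append(len(chunk.split()))   # toy map work
+
+q, results = queue.Queue(maxsize=4), []
+worker = threading.Thread(target=mapper, args=(q, results))
+worker.start()
+ingest(["one two", "three four five"], q)
+worker.join()
+print(sum(results))   # 5
+```
+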
diff --git a/content/publication/sevilla-precs-18/cite.bib b/content/publication/sevilla-precs-18/cite.bib
new file mode 100644
index 00000000000..0ba0437ae08
--- /dev/null
+++ b/content/publication/sevilla-precs-18/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:precs18,
+ abstract = {We describe the four publications we have tried to make reproducible and discuss how each paper has changed our workflows, practices, and collaboration policies. The fundamental insight is that paper artifacts must be made reproducible from the start of the project; artifacts are too difficult to make reproducible when the papers are (1) already published and (2) authored by researchers that are not thinking about reproducibility. In this paper, we present the best practices adopted by our research laboratory, which was sculpted by the pitfalls we have identified for the Popper convention. We conclude with a ``call-to-arms" for the community focused on enhancing reproducibility initiatives for academic conferences, industry environments, and national laboratories. We hope that our experiences will shape a best practices guide for future reproducible papers.},
+ address = {Tempe, AZ},
+ author = {Michael A. Sevilla and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVUy9zZXZpbGxhLXByZWNzMTgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E3NldmlsbGEtcHJlY3MxOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2V2aWxsYS1wcmVjczE4LnBkZgAADgAoABMAcwBlAHYAaQBsAGwAYQAtAHAAcgBlAGMAcwAxADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL1Mvc2V2aWxsYS1wcmVjczE4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ booktitle = {P-RECS'18},
+ date-added = {2018-06-12 17:20:57 +0000},
+ date-modified = {2020-01-04 21:32:14 -0700},
+ keywords = {papers, reproducibility, experience},
+ month = {June 11},
+ title = {Popper Pitfalls: Experiences Following a Reproducibility Convention},
+ year = {2018}
+}
+
diff --git a/content/publication/sevilla-precs-18/index.md b/content/publication/sevilla-precs-18/index.md
new file mode 100644
index 00000000000..c1c88bf308a
--- /dev/null
+++ b/content/publication/sevilla-precs-18/index.md
@@ -0,0 +1,15 @@
+---
+title: "Popper Pitfalls: Experiences Following a Reproducibility Convention"
+date: 2018-06-01
+publishDate: 2020-01-05T06:43:50.430747Z
+authors: ["Michael A. Sevilla", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "We describe the four publications we have tried to make reproducible and discuss how each paper has changed our workflows, practices, and collaboration policies. The fundamental insight is that paper artifacts must be made reproducible from the start of the project; artifacts are too difficult to make reproducible when the papers are (1) already published and (2) authored by researchers that are not thinking about reproducibility. In this paper, we present the best practices adopted by our research laboratory, which was sculpted by the pitfalls we have identified for the Popper convention. We conclude with a ``call-to-arms\" for the community focused on enhancing reproducibility initiatives for academic conferences, industry environments, and national laboratories. We hope that our experiences will shape a best practices guide for future reproducible papers."
+featured: false
+publication: "*P-RECS'18*"
+url_poster: "https://docs.google.com/presentation/d/1ITSS5kdNyGw01k0Uk_ruMRddNKmuKgb3gNxtcd2UcBU/edit?usp=sharing"
+tags: ["papers", "reproducibility", "experience"]
+projects:
+- practical-reproducibility
+---
+
diff --git a/content/publication/sevilla-sc-15/cite.bib b/content/publication/sevilla-sc-15/cite.bib
new file mode 100644
index 00000000000..42260191f8c
--- /dev/null
+++ b/content/publication/sevilla-sc-15/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{sevilla:sc15,
+ abstract = {Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of them to move. We look at resource migration for file system metadata and show how CephFS's dynamic subtree partitioning approach can exploit varying degrees of locality and balance because it can partition the namespace into variable sized units. Unfortunately, the current metadata balancer is complicated and difficult to control because it struggles to address many of the general resource migration challenges inherent to the metadata management problem. To help decouple policy from mechanism, we introduce a programmable storage system that lets the designer inject custom balancing logic. We show the flexibility and transparency of this approach by replicating the strategy of a state-of-the-art metadata balancer and conclude by comparing this strategy to other custom balancers on the same system.},
+ address = {Austin, TX},
+ author = {Michael Sevilla and Noah Watkins and Carlos Maltzahn and Ike Nassi and Scott Brandt and Sage Weil and Greg Farnum and Sam Fineberg},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASUy9zZXZpbGxhLXNjMTUucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EHNldmlsbGEtc2MxNS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2V2aWxsYS1zYzE1LnBkZgAOACIAEABzAGUAdgBpAGwAbABhAC0AcwBjADEANQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvUy9zZXZpbGxhLXNjMTUucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ booktitle = {SC '15},
+ date-added = {2015-07-11 20:49:14 +0000},
+ date-modified = {2020-01-04 21:51:04 -0700},
+ keywords = {papers, metadata, management, loadbalancing, programmable, distributed, systems},
+ month = {November},
+ title = {Mantle: A Programmable Metadata Load Balancer for the Ceph File System},
+ year = {2015}
+}
+
diff --git a/content/publication/sevilla-sc-15/index.md b/content/publication/sevilla-sc-15/index.md
new file mode 100644
index 00000000000..55fa1b5f94f
--- /dev/null
+++ b/content/publication/sevilla-sc-15/index.md
@@ -0,0 +1,14 @@
+---
+title: "Mantle: A Programmable Metadata Load Balancer for the Ceph File System"
+date: 2015-11-01
+publishDate: 2020-01-05T06:43:50.468579Z
+authors: ["Michael Sevilla", "Noah Watkins", "Carlos Maltzahn", "Ike Nassi", "Scott Brandt", "Sage Weil", "Greg Farnum", "Sam Fineberg"]
+publication_types: ["1"]
+abstract: "Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of them to move. We look at resource migration for file system metadata and show how CephFS's dynamic subtree partitioning approach can exploit varying degrees of locality and balance because it can partition the namespace into variable sized units. Unfortunately, the current metadata balancer is complicated and difficult to control because it struggles to address many of the general resource migration challenges inherent to the metadata management problem. To help decouple policy from mechanism, we introduce a programmable storage system that lets the designer inject custom balancing logic. We show the flexibility and transparency of this approach by replicating the strategy of a state-of-the-art metadata balancer and conclude by comparing this strategy to other custom balancers on the same system."
+featured: false
+publication: "*SC '15*"
+tags: ["papers", "metadata", "management", "loadbalancing", "programmable", "distributed", "systems"]
+projects:
+- programmable-storage
+---
+
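+The policy/mechanism split boils down to a migration mechanism that defers three decisions, when to move load, where to move it, and how much, to injected callbacks. The real system injects Lua policies into the CephFS metadata balancer; this Python sketch only illustrates the shape of the split:
+
+```python
+def balance(servers, when, where, howmuch):
+    """Mechanism: migrate load between metadata servers; the three
+    injected callbacks carry the policy."""
+    for src in servers:
+        if when(src):
+            dst = where(src, servers)
+            moved = howmuch(src)
+            src["load"] -= moved
+            dst["load"] += moved
+
+servers = [{"name": "mds0", "load": 80}, {"name": "mds1", "load": 20}]
+balance(
+    servers,
+    when=lambda s: s["load"] > 60,                          # overloaded?
+    where=lambda s, all_: min(all_, key=lambda x: x["load"]),
+    howmuch=lambda s: s["load"] // 2,
+)
+print(servers)   # mds0 drops to 40, mds1 rises to 60
+```
+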
diff --git a/content/publication/sevilla-ucsctr-18/cite.bib b/content/publication/sevilla-ucsctr-18/cite.bib
new file mode 100644
index 00000000000..9269fc46358
--- /dev/null
+++ b/content/publication/sevilla-ucsctr-18/cite.bib
@@ -0,0 +1,16 @@
+@techreport{sevilla:ucsctr18,
+ address = {Santa Cruz, CA},
+ annote = {Submitted to HotStorage'18},
+ author = {Michael A. Sevilla and Reza Nasirigerdeh and Carlos Maltzahn and Jeff LeFevre and Noah Watkins and Peter Alvaro and Margaret Lawson and Jay Lofstead and Jim Pivarski},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWUy9zZXZpbGxhLXVjc2N0cjE4LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRzZXZpbGxhLXVjc2N0cjE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpTOnNldmlsbGEtdWNzY3RyMTgucGRmAA4AKgAUAHMAZQB2AGkAbABsAGEALQB1AGMAcwBjAHQAcgAxADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1Mvc2V2aWxsYS11Y3NjdHIxOC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2018-04-08 04:09:23 +0000},
+ date-modified = {2018-04-08 04:13:07 +0000},
+ institution = {UC Santa Cruz},
+ keywords = {papers, metadata, filesystems, scalable, naming},
+ month = {April 7},
+ number = {UCSC-SOE-18-08},
+ title = {Tintenfisch: File System Namespace Schemas and Generators},
+ type = {Tech. rept.},
+ year = {2018}
+}
+
diff --git a/content/publication/sevilla-ucsctr-18/index.md b/content/publication/sevilla-ucsctr-18/index.md
new file mode 100644
index 00000000000..2a11b7069de
--- /dev/null
+++ b/content/publication/sevilla-ucsctr-18/index.md
@@ -0,0 +1,12 @@
+---
+title: "Tintenfisch: File System Namespace Schemas and Generators"
+date: 2018-04-01
+publishDate: 2020-01-05T06:43:50.436265Z
+authors: ["Michael A. Sevilla", "Reza Nasirigerdeh", "Carlos Maltzahn", "Jeff LeFevre", "Noah Watkins", "Peter Alvaro", "Margaret Lawson", "Jay Lofstead", "Jim Pivarski"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "metadata", "filesystems", "scalable", "naming"]
+---
+
diff --git a/content/publication/shewmaker-icccn-16/cite.bib b/content/publication/shewmaker-icccn-16/cite.bib
new file mode 100644
index 00000000000..e268bad5009
--- /dev/null
+++ b/content/publication/shewmaker-icccn-16/cite.bib
@@ -0,0 +1,16 @@
+@inproceedings{shewmaker:icccn16,
+ abstract = {No one likes waiting in traffic, whether on a road or on a computer network. Stuttering audio, slow interactive feedback, and untimely pauses in video annoy everyone and cost businesses sales and productivity. An ideal network should (1) minimize latency, (2) maximize bandwidth, (3) share resources according to a desired policy, (4) enable incremental deployment, and (5) minimize administrative overhead. Many technologies have been developed, but none yet satisfactorily address all five goals. The best performing solutions developed so far require controlled environments where coordinated modification of multiple components in the network is possible, but they suffer poor performance in more complex scenarios.
+We present TCP Inigo, which uses independent delay-based algorithms on the sender and receiver (i.e. ambidextrously) to satisfy all five goals. In networks with single administrative domains, like those in data centers, Inigo's fairness, bandwidth, and latency indices are up to 1.3x better than the best deployable solution. When deployed in a more complex environment, such as across administrative domains, Inigo possesses a latency distribution tail up to 42x better.},
+ address = {Waikoloa, HI},
+ author = {Andrew G. Shewmaker and Carlos Maltzahn and Katia Obraczka and Scott Brandt and John Bent},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUy9zaGV3bWFrZXItaWNjY24xNi5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vc2hld21ha2VyLWljY2NuMTYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UzpzaGV3bWFrZXItaWNjY24xNi5wZGYAAA4ALAAVAHMAaABlAHcAbQBhAGsAZQByAC0AaQBjAGMAYwBuADEANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUy9zaGV3bWFrZXItaWNjY24xNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAeUy9zaGV3bWFrZXItaWNjY24xNi1zbGlkZXMucGRmTxEBkAAAAAABkAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////HHNoZXdtYWtlci1pY2NjbjE2LXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACAEQvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2hld21ha2VyLWljY2NuMTYtc2xpZGVzLnBkZgAOADoAHABzAGgAZQB3AG0AYQBrAGUAcgAtAGkAYwBjAGMAbgAxADYALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIALy9NeSBEcml2ZS9QYXBlcnMvUy9zaGV3bWFrZXItaWNjY24xNi1zbGlkZXMucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABFAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ booktitle = {25th International Conference on Computer Communications and Networks (ICCCN 2016)},
+ date-added = {2017-02-26 19:02:21 +0000},
+ date-modified = {2020-01-04 22:58:02 -0700},
+ keywords = {papers, networking, congestion, datacenter},
+ month = {August 1-4},
+ title = {TCP Inigo: Ambidextrous Congestion Control},
+ year = {2016}
+}
+
diff --git a/content/publication/shewmaker-icccn-16/index.md b/content/publication/shewmaker-icccn-16/index.md
new file mode 100644
index 00000000000..3e3fce95d0e
--- /dev/null
+++ b/content/publication/shewmaker-icccn-16/index.md
@@ -0,0 +1,13 @@
+---
+title: "TCP Inigo: Ambidextrous Congestion Control"
+date: 2016-08-01
+publishDate: 2020-01-05T06:43:50.449507Z
+authors: ["Andrew G. Shewmaker", "Carlos Maltzahn", "Katia Obraczka", "Scott Brandt", "John Bent"]
+publication_types: ["1"]
+abstract: "No one likes waiting in traffic, whether on a road or on a computer network. Stuttering audio, slow interactive feedback, and untimely pauses in video annoy everyone and cost businesses sales and productivity. An ideal network should (1) minimize latency, (2) maximize bandwidth, (3) share resources according to a desired policy, (4) enable incremental deployment, and (5) minimize administrative overhead. Many technologies have been developed, but none yet satisfactorily address all five goals. The best performing solutions developed so far require controlled environments where coordinated modification of multiple components in the network is possible, but they suffer poor performance in more complex scenarios. We present TCP Inigo, which uses independent delay-based algorithms on the sender and receiver (i.e. ambidextrously) to satisfy all five goals. In networks with single administrative domains, like those in data centers, Inigo's fairness, bandwidth, and latency indices are up to 1.3x better than the best deployable solution. When deployed in a more complex environment, such as across administrative domains, Inigo possesses latency distribution tail up to 42x better."
+featured: false
+publication: "*25th International Conference on Computer Communications and Networks (ICCCN 2016)*"
+url_slides: "https://drive.google.com/file/d/0B5rZ7hI6vXv3S3k5OTIzUVpsTkE/view?usp=sharing"
+tags: ["papers", "networking", "congestion", "datacenter"]
+---
+
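+The defining property of a delay-based algorithm is that the congestion window reacts to measured RTT rather than to packet loss. The update rule below is a grossly simplified illustration of that idea, not Inigo's actual sender or receiver algorithm:
+
+```python
+def update_cwnd(cwnd, rtt, base_rtt, target=1.25):
+    """Grow while measured RTT stays near the uncongested baseline;
+    back off in proportion to how far queueing delay exceeds the
+    target. All constants here are illustrative assumptions."""
+    if rtt <= target * base_rtt:
+        return cwnd + 1                        # additive increase
+    excess = (rtt - target * base_rtt) / rtt
+    return max(1.0, cwnd * (1 - excess / 2))   # proportional decrease
+
+cwnd = 10.0
+for rtt in [10, 10, 12, 20, 11]:               # RTT samples in ms
+    cwnd = update_cwnd(cwnd, rtt, base_rtt=10)
+print(round(cwnd, 2))
+```
+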
diff --git a/content/publication/shewmaker-ucsctr-14/cite.bib b/content/publication/shewmaker-ucsctr-14/cite.bib
new file mode 100644
index 00000000000..fc05e25ed62
--- /dev/null
+++ b/content/publication/shewmaker-ucsctr-14/cite.bib
@@ -0,0 +1,16 @@
+@techreport{shewmaker:ucsctr14,
+ abstract = {The RUN (Reduction to UNiprocessor) [18, 19, 13] algorithm was first described by Regnier et al. as a novel and elegant solution to real-time multiprocessor scheduling. The first practical implementation of RUN [3], created by Compagnin et al., both verified the simulation results and showed that it can be efficiently implemented on top of standard operating system primitives. While RUN is now the proven best solution for scheduling fixed rate tasks on multiprocessors, it can also be applied to other resources. This technical report briefly describes RUN and how it could be used in any situation involving an array of multiple resources where some form of preemptions and migrations are allowed (although these must be minimized). It also describes how buffers can be sanity checked in a system where a RUN-scheduled resource is consuming data from another RUN-scheduled resource.},
+ address = {Santa Cruz, CA},
+ author = {Andrew Shewmaker and Carlos Maltzahn and Katia Obraczka and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYUy9zaGV3bWFrZXItdWNzY3RyMTQucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FnNoZXdtYWtlci11Y3NjdHIxNC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2hld21ha2VyLXVjc2N0cjE0LnBkZgAOAC4AFgBzAGgAZQB3AG0AYQBrAGUAcgAtAHUAYwBzAGMAdAByADEANAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvUy9zaGV3bWFrZXItdWNzY3RyMTQucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ date-added = {2014-09-06 04:13:59 +0000},
+ date-modified = {2020-01-04 21:53:19 -0700},
+ institution = {University of California at Santa Cruz},
+ keywords = {papers, scheduling, networking, realtime, performance, management},
+ month = {July 23},
+ number = {UCSC-SOE-14-08},
+ title = {Run, Fatboy, Run: Applying the Reduction to Uniprocessor Algorithm to Other Wide Resources},
+ type = {Tech. rept.},
+ year = {2014}
+}
+
diff --git a/content/publication/shewmaker-ucsctr-14/index.md b/content/publication/shewmaker-ucsctr-14/index.md
new file mode 100644
index 00000000000..d7f2db1ab0f
--- /dev/null
+++ b/content/publication/shewmaker-ucsctr-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Run, Fatboy, Run: Applying the Reduction to Uniprocessor Algorithm to Other Wide Resources"
+date: 2014-07-01
+publishDate: 2020-01-05T06:43:50.478408Z
+authors: ["Andrew Shewmaker", "Carlos Maltzahn", "Katia Obraczka", "Scott Brandt"]
+publication_types: ["4"]
+abstract: "The RUN (Reduction to UNiprocessor) [18, 19, 13] algorithm was first described by Regnier, et al. as a novel and elegant solution to real-time multiprocessor scheduling. The first practical implementation of RUN [3] created by Compagnin, et. al., both verified the simulation results and showed that it can be efficiently implemented on top of standard operating system primitives. While RUN is now the proven best solution for scheduling fixed rate tasks on multiprocessors, it can also be applied to other resources. This technical report briefly describes RUN and how it could be used in any situation involving an array of multiple resources where some form of preemptions and migrations are allowed (although must be minimized). It also describes how buffers can be sanity checked in a system where a RUN-scheduled resource is consuming data from another RUN-scheduled resource."
+featured: false
+publication: ""
+tags: ["papers", "scheduling", "networking", "realtime", "performance", "management"]
+---
+
diff --git a/content/publication/skourtis-atc-14/cite.bib b/content/publication/skourtis-atc-14/cite.bib
new file mode 100644
index 00000000000..b5382cd86c0
--- /dev/null
+++ b/content/publication/skourtis-atc-14/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{skourtis:atc14,
+ abstract = {Modern applications and virtualization require fast and predictable storage. Hard-drives have low and unpredictable performance, while keeping everything in DRAM is still prohibitively expensive or unnecessary in many cases. Solid-state drives offer a balance between performance and cost and are becoming increasingly popular in storage systems, playing the role of large caches and permanent storage. Although their read performance is high and predictable, SSDs frequently block in the presence of writes, exceeding hard-drive latency and leading to unpredictable performance.
+Many systems with mixed workloads have low latency requirements or require predictable performance and guarantees. In such cases the performance variance of SSDs becomes a problem for both predictability and raw performance. In this paper, we propose Rails, a design based on redundancy, which provides predictable performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as well as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of Rails and read/write separation in solid-state drives.},
+ address = {Philadelphia, PA},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Noah Watkins and Carlos Maltzahn and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUy9za291cnRpcy1hdGMxNC5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Sc2tvdXJ0aXMtYXRjMTQucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Uzpza291cnRpcy1hdGMxNC5wZGYADgAmABIAcwBrAG8AdQByAHQAaQBzAC0AYQB0AGMAMQA0AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9TL3Nrb3VydGlzLWF0YzE0LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {USENIX ATC '14},
+ date-added = {2014-05-10 00:06:33 +0000},
+ date-modified = {2020-01-04 21:58:01 -0700},
+ keywords = {papers, flash, performance, redundancy, qos},
+ month = {June 19-20},
+ title = {Flash on Rails: Consistent Flash Performance through Redundancy},
+ year = {2014}
+}
+
diff --git a/content/publication/skourtis-atc-14/index.md b/content/publication/skourtis-atc-14/index.md
new file mode 100644
index 00000000000..b92d16e9044
--- /dev/null
+++ b/content/publication/skourtis-atc-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Flash on Rails: Consistent Flash Performance through Redundancy"
+date: 2014-06-01
+publishDate: 2020-01-05T06:43:50.500438Z
+authors: ["Dimitris Skourtis", "Dimitris Achlioptas", "Noah Watkins", "Carlos Maltzahn", "Scott Brandt"]
+publication_types: ["1"]
+abstract: "Modern applications and virtualization require fast and predictable storage. Hard-drives have low and unpredictable performance, while keeping everything in DRAM is still prohibitively expensive or unnecessary in many cases. Solid-state drives offer a balance between performance and cost and are becoming increasingly popular in storage systems, playing the role of large caches and permanent storage. Although their read performance is high and predictable, SSDs frequently block in the presence of writes, exceeding hard-drive latency and leading to unpredictable performance. Many systems with mixed workloads have low latency requirements or require predictable performance and guarantees. In such cases the performance variance of SSDs becomes a problem for both predictability and raw performance. In this paper, we propose Rails, a design based on redundancy, which provides predictable performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as well as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of Rails and read/write separation in solid-state drives."
+featured: false
+publication: "*USENIX ATC '14*"
+tags: ["papers", "flash", "performance", "redundancy", "qos"]
+---
+
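+The read/write separation behind Rails can be pictured with two mirrored drives: one replica serves only reads while the other absorbs writes, and the roles swap once buffered writes have been replayed. A toy sketch of that mechanism (illustrative, not the Rails implementation):
+
+```python
+class ReadWriteSeparator:
+    """Toy two-replica scheme: reads never touch the drive that is
+    currently absorbing writes, so they see read-only latency."""
+
+    def __init__(self):
+        self.drives = [{}, {}]
+        self.reader = 0          # index of the read-only replica
+        self.buffer = []         # writes the reader has not seen yet
+
+    def read(self, key):
+        return self.drives[self.reader].get(key)
+
+    def write(self, key, value):
+        self.drives[1 - self.reader][key] = value
+        self.buffer.append((key, value))
+
+    def switch(self):
+        # Bring the reader up to date, then swap roles.
+        for key, value in self.buffer:
+            self.drives[self.reader][key] = value
+        self.buffer.clear()
+        self.reader = 1 - self.reader
+
+ssd = ReadWriteSeparator()
+ssd.write("a", 1)
+print(ssd.read("a"))   # None: the read replica lags until a switch
+ssd.switch()
+print(ssd.read("a"))   # 1
+```
+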
diff --git a/content/publication/skourtis-fast-13-wip/cite.bib b/content/publication/skourtis-fast-13-wip/cite.bib
new file mode 100644
index 00000000000..377d83c4c5d
--- /dev/null
+++ b/content/publication/skourtis-fast-13-wip/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{skourtis:fast13wip,
+ address = {San Jose, CA},
+ author = {Dimitris Skourtis and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYUy9za291cnRpcy1mYXN0MTN3aXAucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FnNrb3VydGlzLWZhc3QxM3dpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2tvdXJ0aXMtZmFzdDEzd2lwLnBkZgAOAC4AFgBzAGsAbwB1AHIAdABpAHMALQBmAGEAcwB0ADEAMwB3AGkAcAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvUy9za291cnRpcy1mYXN0MTN3aXAucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAfUy9za291cnRpcy1mYXN0MTN3aXAtcG9zdGVyLnBkZk8RAZQAAAAAAZQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////x1za291cnRpcy1mYXN0MTN3aXAtcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUwAAAgBFLzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpTOnNrb3VydGlzLWZhc3QxM3dpcC1wb3N0ZXIucGRmAAAOADwAHQBzAGsAbwB1AHIAdABpAHMALQBmAGEAcwB0ADEAMwB3AGkAcAAtAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAwL015IERyaXZlL1BhcGVycy9TL3Nrb3VydGlzLWZhc3QxM3dpcC1wb3N0ZXIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEYAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB3g==},
+ booktitle = {Work-in-Progress and Poster Session at the Conference on File and Storage Technology (FAST 2013)},
+ date-added = {2019-12-26 19:57:02 -0800},
+ date-modified = {2019-12-29 16:34:24 -0800},
+ keywords = {shortpapers, performance, predictable, flash, redundancy},
+ month = {February 12-15},
+ title = {High Performance & Low Latency in Solid-State Drives Through Redundancy},
+ year = {2013}
+}
+
diff --git a/content/publication/skourtis-fast-13-wip/index.md b/content/publication/skourtis-fast-13-wip/index.md
new file mode 100644
index 00000000000..1a519350b9a
--- /dev/null
+++ b/content/publication/skourtis-fast-13-wip/index.md
@@ -0,0 +1,12 @@
+---
+title: "High Performance & Low Latency in Solid-State Drives Through Redundancy"
+date: 2013-02-01
+publishDate: 2020-01-05T06:43:50.383451Z
+authors: ["Dimitris Skourtis", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Work-in-Progress and Poster Session at the Conference on File and Storage Technology (FAST 2013)*"
+tags: ["shortpapers", "performance", "predictable", "flash", "redundancy"]
+---
+
diff --git a/content/publication/skourtis-inflow-13/cite.bib b/content/publication/skourtis-inflow-13/cite.bib
new file mode 100644
index 00000000000..55b9110595e
--- /dev/null
+++ b/content/publication/skourtis-inflow-13/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{skourtis:inflow13,
+ abstract = {Solid-state drives are becoming increasingly popular in enterprise storage systems, playing the role of large caches and permanent storage. Although SSDs provide faster random access than hard-drives, their performance under read/write workloads is highly variable, often exceeding that of hard-drives (e.g., taking 100ms for a single read). Many systems with mixed workloads have low latency requirements, or require predictable performance and guarantees. In such cases, the performance variance of SSDs becomes a problem for both predictability and raw performance.
+In this paper, we propose a design based on redundancy, which provides high performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as well as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of read/write separation in solid-state drives.},
+ address = {Farmington, PA},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Carlos Maltzahn and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUy9za291cnRpcy1pbmZsb3cxMy5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vc2tvdXJ0aXMtaW5mbG93MTMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Uzpza291cnRpcy1pbmZsb3cxMy5wZGYAAA4ALAAVAHMAawBvAHUAcgB0AGkAcwAtAGkAbgBmAGwAbwB3ADEAMwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUy9za291cnRpcy1pbmZsb3cxMy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {INFLOW '13},
+ date-added = {2013-09-11 06:19:23 +0000},
+ date-modified = {2020-01-04 22:04:04 -0700},
+ keywords = {papers, flash, erasurecodes, redundancy, storage, distributed, systems},
+ month = {November 3},
+ title = {High Performance & Low Latency in Solid-State Drives Through Redundancy},
+ year = {2013}
+}
+
diff --git a/content/publication/skourtis-inflow-13/index.md b/content/publication/skourtis-inflow-13/index.md
new file mode 100644
index 00000000000..1a817286dc4
--- /dev/null
+++ b/content/publication/skourtis-inflow-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "High Performance & Low Latency in Solid-State Drives Through Redundancy"
+date: 2013-11-01
+publishDate: 2020-01-05T06:43:50.527783Z
+authors: ["Dimitris Skourtis", "Dimitris Achlioptas", "Carlos Maltzahn", "Scott Brandt"]
+publication_types: ["1"]
+abstract: "Solid-state drives are becoming increasingly popular in enterprise storage systems, playing the role of large caches and permanent storage. Although SSDs provide faster random access than hard-drives, their performance under read/write workloads is highly variable often exceeding that of harddrives (e.g., taking 100ms for a single read). Many systems with mixed workloads have low latency requirements, or require predictable performance and guarantees. In such cases, the performance variance of SSDs becomes a problem for both predictability and raw performance. In this paper, we propose a design based on redundancy, which provides high performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as good as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of read/write separation in solid-state drives."
+featured: false
+publication: "*INFLOW '13*"
+tags: ["papers", "flash", "erasurecodes", "redundancy", "storage", "distributed", "systems"]
+---
+
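The read/write separation this abstract describes can be made concrete with a small sketch. The toy model below is an illustration of the idea only, not the paper's implementation, and all names are invented: two mirrored drives alternate read-only and write-only windows, so reads never queue behind writes (or the flash garbage collection that writes can trigger).

```python
# A minimal sketch (not the paper's code) of read/write separation over
# two mirrored drives: roles alternate in fixed windows, reads always hit
# the drive in its read-only window, and writes queued during a window
# are applied to the other replica before the roles flip.

class RailsPair:
    def __init__(self):
        self.drives = ([], [])   # in-memory stand-ins for two SSDs
        self.reader = 0          # which drive is in its read-only window
        self.pending = []        # writes not yet applied to the reader

    def write(self, block):
        writer = 1 - self.reader
        self.drives[writer].append(block)   # absorbed by the write-side drive
        self.pending.append(block)          # must reach the reader later

    def read_all(self):
        # Reads only touch the read-side drive, so they never wait on a
        # write -- the core of the paper's latency argument.
        return list(self.drives[self.reader])

    def switch_windows(self):
        # Catch the old reader up on the writes it missed, then flip roles.
        self.drives[self.reader].extend(self.pending)
        self.pending.clear()
        self.reader = 1 - self.reader

pair = RailsPair()
pair.write("b0"); pair.write("b1")
print(pair.read_all())      # [] -- reads see the pre-window state
pair.switch_windows()
print(pair.read_all())      # ['b0', 'b1'] -- writes surface after the flip
```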
diff --git a/content/publication/skourtis-inflow-14/cite.bib b/content/publication/skourtis-inflow-14/cite.bib
new file mode 100644
index 00000000000..51624305572
--- /dev/null
+++ b/content/publication/skourtis-inflow-14/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{skourtis:inflow14,
+ abstract = {We want to create a scalable flash storage system that provides read/write separation and uses erasure coding to provide reliability without the storage cost of replication. Flash on Rails [19] is a system for enabling consistent performance in flash storage by physically separating reads from writes through redundancy. In principle, Rails supports erasure codes. However, it has only been evaluated using replication in small arrays, so it is currently uncertain how it would scale with erasure coding.
+In this work we consider the applicability of erasure coding in Rails, in a new system called eRails. We consider the effects of computation due to encoding/decoding on the raw performance, as well as its effect on performance consistency. We demonstrate that up to a certain number of drives the performance remains unaffected while the computation cost remains modest. After that point, the computational cost grows quickly due to coding itself, making further scaling inefficient. To support an arbitrary number of drives we present a design allowing us to scale eRails by constructing overlapping erasure coding groups that preserve read/write separation. Finally, through benchmarks we demonstrate that eRails achieves read/write separation and consistent read performance under read/write workloads.},
+ address = {Broomfield, CO},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Noah Watkins and Carlos Maltzahn and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUy9za291cnRpcy1pbmZsb3cxNC5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vc2tvdXJ0aXMtaW5mbG93MTQucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Uzpza291cnRpcy1pbmZsb3cxNC5wZGYAAA4ALAAVAHMAawBvAHUAcgB0AGkAcwAtAGkAbgBmAGwAbwB3ADEANAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUy9za291cnRpcy1pbmZsb3cxNC5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ booktitle = {INFLOW '14 (at OSDI'14)},
+ date-added = {2014-12-06 21:50:01 +0000},
+ date-modified = {2020-01-04 21:52:42 -0700},
+ keywords = {papers, erasurecodes, performance, flash, garbagecollection, predictable},
+ month = {October 5},
+ title = {Erasure Coding & Read/Write Separation in Flash Storage},
+ year = {2014}
+}
+
diff --git a/content/publication/skourtis-inflow-14/index.md b/content/publication/skourtis-inflow-14/index.md
new file mode 100644
index 00000000000..48c91cd1c45
--- /dev/null
+++ b/content/publication/skourtis-inflow-14/index.md
@@ -0,0 +1,12 @@
+---
+title: "Erasure Coding & Read/Write Separation in Flash Storage"
+date: 2014-10-01
+publishDate: 2020-01-05T06:43:50.476810Z
+authors: ["Dimitris Skourtis", "Dimitris Achlioptas", "Noah Watkins", "Carlos Maltzahn", "Scott Brandt"]
+publication_types: ["1"]
+abstract: "We want to create a scalable flash storage system that provides read/write separation and uses erasure coding to provide reliability without the storage cost of replication. Flash on Rails [19] is a system for enabling consistent performance in flash storage by physically separating reads from writes through redundancy. In principle, Rails supports erasure codes. However, it has only been evaluated using replication in small arrays, so it is currently uncertain how it would scale with erasure coding. In this work we consider the applicability of erasure coding in Rails, in a new system called eRails. We consider the effects of computation due to encoding/decoding on the raw performance, as well as its effect on performance consistency. We demonstrate that up to a certain number of drives the performance remains unaffected while the computation cost remains modest. After that point, the computational cost grows quickly due to coding itself making further scaling inefficient. To support an arbitrary number of drives we present a design allowing us to scale eRails by constructing overlapping erasure coding groups that preserve read/write separation. Finally, through benchmarks we demonstrate that eRails achieves read/write separation and consistent read performance under read/write workloads."
+featured: false
+publication: "*INFLOW '14 (at OSDI'14)*"
+tags: ["papers", "erasurecodes", "performance", "flash", "garbagecollection", "predictable"]
+---
+
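eRails replaces replication with erasure coding while preserving read/write separation. A toy XOR-parity example (hypothetical code, not eRails itself) shows the key mechanism: a read aimed at a drive that is currently in its write window can be reconstructed from the remaining drives instead of waiting for it.

```python
# Toy XOR-parity sketch of the eRails idea (illustrative only): with k
# data drives plus one parity drive, a read targeting a drive that is
# absorbing writes is rebuilt from the other drives, so reads never
# block behind a writing drive.

from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# One stripe across 3 data drives + 1 parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = reduce(xor, data)          # parity = d0 ^ d1 ^ d2

def read_block(i: int, writing: set) -> bytes:
    if i not in writing:
        return data[i]              # fast path: drive is in a read window
    # Drive i is writing: reconstruct its block from the surviving
    # data blocks and parity instead of waiting.
    others = [d for j, d in enumerate(data) if j != i]
    return reduce(xor, others + [parity])

assert read_block(1, writing={1}) == b"BBBB"
```

The coding cost the abstract mentions is the encode/decode work above; the paper's overlapping coding groups bound how many drives participate in any one reconstruction.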
diff --git a/content/publication/skourtis-ucsctr-13-a/cite.bib b/content/publication/skourtis-ucsctr-13-a/cite.bib
new file mode 100644
index 00000000000..ac3a7ba8654
--- /dev/null
+++ b/content/publication/skourtis-ucsctr-13-a/cite.bib
@@ -0,0 +1,15 @@
+@techreport{skourtis:ucsctr13a,
+ address = {Santa Cruz, CA},
+ author = {Dimitris Skourtis and Noah Watkins and Dimitris Achlioptas and Carlos Maltzahn and Scott Brandt},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYUy9za291cnRpcy11Y3NjdHIxM2EucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FnNrb3VydGlzLXVjc2N0cjEzYS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFTAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlM6c2tvdXJ0aXMtdWNzY3RyMTNhLnBkZgAOAC4AFgBzAGsAbwB1AHIAdABpAHMALQB1AGMAcwBjAHQAcgAxADMAYQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvUy9za291cnRpcy11Y3NjdHIxM2EucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ date-added = {2013-07-18 18:38:34 +0000},
+ date-modified = {2013-07-19 05:52:07 +0000},
+ institution = {UCSC},
+ keywords = {papers, flash, cluster, redundancy, performance, management, qos, parallel, filesystems},
+ month = {July 18},
+ number = {UCSC-SOE-13-10},
+ title = {Latency Minimization in SSD Clusters for Free},
+ type = {Tech. rept.},
+ year = {2013}
+}
+
diff --git a/content/publication/skourtis-ucsctr-13-a/index.md b/content/publication/skourtis-ucsctr-13-a/index.md
new file mode 100644
index 00000000000..4b4f74445a9
--- /dev/null
+++ b/content/publication/skourtis-ucsctr-13-a/index.md
@@ -0,0 +1,12 @@
+---
+title: "Latency Minimization in SSD Clusters for Free"
+date: 2013-07-01
+publishDate: 2020-01-05T06:43:50.534745Z
+authors: ["Dimitris Skourtis", "Noah Watkins", "Dimitris Achlioptas", "Carlos Maltzahn", "Scott Brandt"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "flash", "cluster", "redundancy", "performance", "management", "qos", "parallel", "filesystems"]
+---
+
diff --git a/content/publication/skourtis-ucsctr-13/cite.bib b/content/publication/skourtis-ucsctr-13/cite.bib
new file mode 100644
index 00000000000..1c83fff6f37
--- /dev/null
+++ b/content/publication/skourtis-ucsctr-13/cite.bib
@@ -0,0 +1,16 @@
+@techreport{skourtis:ucsctr13,
+ address = {Santa Cruz, CA},
+ author = {Dimitris Skourtis and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUy9za291cnRpcy11Y3NjdHIxMy5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Vc2tvdXJ0aXMtdWNzY3RyMTMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVMAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Uzpza291cnRpcy11Y3NjdHIxMy5wZGYAAA4ALAAVAHMAawBvAHUAcgB0AGkAcwAtAHUAYwBzAGMAdAByADEAMwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUy9za291cnRpcy11Y3NjdHIxMy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2},
+ bdsk-url-1 = {http://www.soe.ucsc.edu/research/technical-reports/ucsc-soe-13-08},
+ date-added = {2013-07-17 23:54:42 +0000},
+ date-modified = {2013-07-17 23:58:48 +0000},
+ institution = {UCSC},
+ keywords = {papers, flash, performance, management, qos},
+ month = {May 14},
+ number = {UCSC-SOE-13-08},
+ title = {Ianus: Guaranteeing High Performance in Solid-State Drives},
+ type = {Tech. rept.},
+ year = {2013}
+}
+
diff --git a/content/publication/skourtis-ucsctr-13/index.md b/content/publication/skourtis-ucsctr-13/index.md
new file mode 100644
index 00000000000..3a9757ee102
--- /dev/null
+++ b/content/publication/skourtis-ucsctr-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "Ianus: Guaranteeing High Performance in Solid-State Drives"
+date: 2013-05-01
+publishDate: 2020-01-05T06:43:50.536704Z
+authors: ["Dimitris Skourtis", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+tags: ["papers", "flash", "performance", "management", "qos"]
+---
+
diff --git a/content/publication/sloan-ucospo-22/cite.bib b/content/publication/sloan-ucospo-22/cite.bib
new file mode 100644
index 00000000000..f3b6ad77e0f
--- /dev/null
+++ b/content/publication/sloan-ucospo-22/cite.bib
@@ -0,0 +1,11 @@
+@unpublished{sloan:ucospo22,
+ author = {Alfred P. Sloan Foundation -- Better Software for Science Program},
+ date-added = {2022-08-04 06:46:49 -0700},
+ date-modified = {2022-08-04 06:49:20 -0700},
+ keywords = {funding},
+ month = {January},
+ note = {Available at sloan.org/grant-detail/9723},
+ title = {To pilot a postdoctoral fellowship on open source software development and support other activities at the University of California Santa Cruz Open Source Program Office},
+ year = {2022}
+}
+
diff --git a/content/publication/storer-sisw-05/cite.bib b/content/publication/storer-sisw-05/cite.bib
new file mode 100644
index 00000000000..6ae06c26d17
--- /dev/null
+++ b/content/publication/storer-sisw-05/cite.bib
@@ -0,0 +1,12 @@
+@inproceedings{storer:sisw05,
+ author = {Mark Storer and Kevin Greenan and Ethan L. Miller and Carlos Maltzahn},
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/storer-sisw05.pdf},
+ booktitle = {Proceedings of the 3rd International IEEE Security in Storage Workshop},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:42 -0800},
+ local-url = {/Users/carlosmalt/Documents/Papers/storer-sisw05.pdf},
+ month = {December},
+ title = {POTSHARDS: Storing Data for the Long-term Without Encryption},
+ year = {2005}
+}
+
diff --git a/content/publication/storer-sisw-05/index.md b/content/publication/storer-sisw-05/index.md
new file mode 100644
index 00000000000..c6e7094670e
--- /dev/null
+++ b/content/publication/storer-sisw-05/index.md
@@ -0,0 +1,11 @@
+---
+title: "POTSHARDS: Storing Data for the Long-term Without Encryption"
+date: 2005-12-01
+publishDate: 2020-01-05T06:43:50.657149Z
+authors: ["Mark Storer", "Kevin Greenan", "Ethan L. Miller", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Proceedings of the 3rd International IEEE Security in Storage Workshop*"
+---
+
diff --git a/content/publication/ulmer-compsys-23/cite.bib b/content/publication/ulmer-compsys-23/cite.bib
new file mode 100644
index 00000000000..b093f2df6d2
--- /dev/null
+++ b/content/publication/ulmer-compsys-23/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{ulmer:compsys23,
+ address = {St. Petersburg, FL, USA},
+ author = {Craig Ulmer and Jianshen Liu and Carlos Maltzahn and Matthew L. Curry},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1UtVi91bG1lci1jb21wc3lzMjMucGRmTxEBcgAAAAABcgACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4An/aUJEAAH/////E3VsbWVyLWNvbXBzeXMyMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////gN9l2AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANVLVYAAAIAOi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6VS1WOnVsbWVyLWNvbXBzeXMyMy5wZGYADgAoABMAdQBsAG0AZQByAC0AYwBvAG0AcABzAHkAcwAyADMALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADhVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9VLVYvdWxtZXItY29tcHN5czIzLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABXAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAc0=},
+ booktitle = {2nd Workshop on Composable Systems (COMPSYS 2023, co-located with IPDPS 2023)},
+ date-added = {2023-03-09 10:29:28 -0800},
+ date-modified = {2023-03-09 10:30:50 -0800},
+ keywords = {smartnics, composability, datamanagement},
+ month = {May 15-19},
+ title = {Extending Composable Data Services into SmartNICs (Best Paper Award)},
+ year = {2023}
+}
+
diff --git a/content/publication/ulmer-compsys-23/index.md b/content/publication/ulmer-compsys-23/index.md
new file mode 100644
index 00000000000..25ae9cbdb7f
--- /dev/null
+++ b/content/publication/ulmer-compsys-23/index.md
@@ -0,0 +1,16 @@
+---
+title: "Extending Composable Data Services into SmartNICS (Best Paper Award)"
+date: 2023-05-01
+publishDate: 2023-06-08T01:20:50.634899Z
+authors: ["Craig Ulmer", "Jianshen Liu", "Carlos Maltzahn", "Matthew L. Curry"]
+publication_types: ["1"]
+abstract: "Advanced scientific-computing workflows rely on composable data services to migrate data between simulation and analysis jobs that run in parallel on high-performance computing (HPC) platforms. Unfortunately, these services consume compute-node memory and processing resources that could otherwise be used to complete the workflow’s tasks. The emergence of programmable network interface cards, or SmartNICs, presents an opportunity to host data services in an isolated space within a compute node that does not impact host resources. In this paper we explore extending data services into SmartNICs and describe a software stack for services that uses Faodel and Apache Arrow. To illustrate how this stack operates, we present a case study that implements a distributed, particle-sifting service for reorganizing simulation results. Performance experiments from a 100-node cluster equipped with 100Gb/s BlueField-2 SmartNICs indicate that current SmartNICs can perform useful data management tasks, albeit at a lower throughput than hosts."
+featured: false
+publication: "*2nd Workshop on Composable Systems (COMPSYS 2023, co-located with IPDPS 2023)*"
+tags: ["smartnics", "composability", "datamanagement"]
+projects:
+ - smartnic
+ - eusocial-storage
+ - skyhook
+---
+
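The paper's services are built on Faodel and Apache Arrow. As a rough illustration of the particle-sifting step (a hypothetical sketch using pyarrow; the Faodel transport and the SmartNIC deployment are out of scope here), one can filter a batch of particles and serialize the survivors to Arrow IPC for forwarding:

```python
# Hypothetical sketch of "particle sifting" with Apache Arrow (pyarrow);
# not the paper's code -- column names and the predicate are invented.

import pyarrow as pa
import pyarrow.compute as pc

# A small batch of simulation particles.
particles = pa.table({
    "id":     pa.array([0, 1, 2, 3], type=pa.int64()),
    "energy": pa.array([0.4, 2.1, 9.7, 1.3], type=pa.float64()),
})

# Sift: keep only high-energy particles for the analysis job.
sifted = particles.filter(pc.greater(particles["energy"], 1.0))

# Serialize to the Arrow IPC stream format so a data service can
# forward the batch without re-parsing rows.
sink = pa.BufferOutputStream()
with pa.ipc.new_stream(sink, sifted.schema) as writer:
    writer.write_table(sifted)
payload = sink.getvalue()           # bytes ready for the transport layer
print(sifted.num_rows, len(payload.to_pybytes()))
```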
diff --git a/content/publication/uta-nsdi-20/cite.bib b/content/publication/uta-nsdi-20/cite.bib
new file mode 100644
index 00000000000..f4dbaf9df94
--- /dev/null
+++ b/content/publication/uta-nsdi-20/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{uta:nsdi20,
+ abstract = {Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability when running experiments in the cloud. Focusing on networks, we assess the impact of variability on cloud-based big-data workloads by gathering traces from mainstream commercial clouds and private research clouds. Our data collection consists of millions of datapoints gathered while transferring over 9 petabytes of data. We characterize the network variability present in our data and show that, even though commercial cloud providers implement mechanisms for quality-of-service enforcement, variability still occurs, and is even exacerbated by such mechanisms and service provider policies. We show how big-data workloads suffer from significant slowdowns and lack predictability and replicability, even when state-of-the-art experimentation techniques are used. We provide guidelines for practitioners to reduce the volatility of big data performance, making experiments more repeatable.},
+ address = {Santa Clara, CA},
+ author = {Alexandru Uta and Alexandru Custura and Dmitry Duplyakin and Ivo Jimenez and Jan Rellermeyer and Carlos Maltzahn and Robert Ricci and Alexandru Iosup},
+ booktitle = {NSDI '20},
+ date-added = {2019-12-26 15:33:24 -0800},
+ date-modified = {2020-01-04 21:25:56 -0700},
+ keywords = {papers, reproducibility, datacenter, performance},
+ month = {February 25-27},
+ title = {Is Big Data Performance Reproducible in Modern Cloud Networks?},
+ year = {2020}
+}
+
diff --git a/content/publication/uta-nsdi-20/index.md b/content/publication/uta-nsdi-20/index.md
new file mode 100644
index 00000000000..e326d0ac7bc
--- /dev/null
+++ b/content/publication/uta-nsdi-20/index.md
@@ -0,0 +1,15 @@
+---
+title: "Is Big Data Performance Reproducible in Modern Cloud Networks?"
+date: 2020-02-01
+publishDate: 2020-01-05T06:43:50.413428Z
+authors: ["Alexandru Uta", "Alexandru Custura", "Dmitry Duplyakin", "Ivo Jimenez", "Jan Rellermeyer", "Carlos Maltzahn", "Robert Ricci", "Alexandru Iosup"]
+publication_types: ["1"]
+abstract: "Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability when running experiments in the cloud. Focusing on networks, we assess the impact of variability on cloud-based big-data workloads by gathering traces from mainstream commercial clouds and private research clouds. Our data collection consists of millions of datapoints gathered while transferring over 9 petabytes of data. We characterize the network variability present in our data and show that, even though commercial cloud providers implement mechanisms for quality-of-service enforcement, variability still occurs, and is even exacerbated by such mechanisms and service provider policies. We show how big-data workloads suffer from significant slowdowns and lack predictability and replicability, even when state-of-the-art experimentation techniques are used. We provide guidelines for practitioners to reduce the volatility of big data performance, making experiments more repeatable."
+featured: false
+url_slides: https://drive.google.com/file/d/1NL7fQUEflV5JJwwgbg-9ZWVFzLlJhSUi/view?usp=sharing
+publication: "*NSDI '20*"
+tags: ["papers", "reproducibility", "datacenter", "performance"]
+projects:
+- practical-reproducibility
+---
+
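The kind of variability characterization the paper performs can be sketched in a few lines. The sample numbers below are invented, not from the paper's dataset; the point is that repeated cloud runs should be summarized with dispersion measures rather than a single mean:

```python
# Minimal sketch of characterizing run-to-run variability in repeated
# cloud measurements (invented sample numbers, illustrative statistics).

import statistics

throughput_gbps = [9.4, 9.1, 4.8, 9.3, 8.9, 9.2, 5.1, 9.0]  # repeated runs

mean = statistics.mean(throughput_gbps)
stdev = statistics.stdev(throughput_gbps)
median = statistics.median(throughput_gbps)
q1, _, q3 = statistics.quantiles(throughput_gbps, n=4)

print(f"mean={mean:.2f}  cv={stdev / mean:.2%}")   # coefficient of variation
print(f"median={median:.2f}  IQR=[{q1:.2f}, {q3:.2f}]")
# A large CV or a wide IQR is exactly the signal the paper argues
# experimenters must report instead of a single-run number.
```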
diff --git a/content/publication/wacha-eurosys-10/cite.bib b/content/publication/wacha-eurosys-10/cite.bib
new file mode 100644
index 00000000000..969560fe2a6
--- /dev/null
+++ b/content/publication/wacha-eurosys-10/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{wacha:eurosys10,
+ address = {Paris, France},
+ author = {Rosie Wacha and Scott A. Brandt and John Bent and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbVy93YWNoYS1ldXJvc3lzMTBwb3N0ZXIucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GXdhY2hhLWV1cm9zeXMxMHBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2FjaGEtZXVyb3N5czEwcG9zdGVyLnBkZgAADgA0ABkAdwBhAGMAaABhAC0AZQB1AHIAbwBzAHkAcwAxADAAcABvAHMAdABlAHIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL1cvd2FjaGEtZXVyb3N5czEwcG9zdGVyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ booktitle = {Poster Session and Ph.D. Workshop at EuroSys 2010},
+ date-added = {2011-05-26 23:29:21 -0700},
+ date-modified = {2020-01-05 05:37:15 -0700},
+ keywords = {shortpapers, raid, flash},
+ month = {April 13-16},
+ title = {RAID4S: Adding SSDs to RAID Arrays},
+ year = {2010}
+}
+
diff --git a/content/publication/wacha-eurosys-10/index.md b/content/publication/wacha-eurosys-10/index.md
new file mode 100644
index 00000000000..ea28ebccdbb
--- /dev/null
+++ b/content/publication/wacha-eurosys-10/index.md
@@ -0,0 +1,12 @@
+---
+title: "RAID4S: Adding SSDs to RAID Arrays"
+date: 2010-04-01
+publishDate: 2020-01-05T12:39:43.061147Z
+authors: ["Rosie Wacha", "Scott A. Brandt", "John Bent", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Poster Session and Ph.D. Workshop at EuroSys 2010*"
+tags: ["shortpapers", "raid", "flash"]
+---
+
diff --git a/content/publication/wacha-fast-10-poster/cite.bib b/content/publication/wacha-fast-10-poster/cite.bib
new file mode 100644
index 00000000000..b9d26e66118
--- /dev/null
+++ b/content/publication/wacha-fast-10-poster/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{wacha:fast10poster,
+ address = {San Jose, CA},
+ author = {Rosie Wacha and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAYVy93YWNoYS1mYXN0MTBwb3N0ZXIucGRmTxEBeAAAAAABeAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////FndhY2hhLWZhc3QxMHBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAD4vOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2FjaGEtZmFzdDEwcG9zdGVyLnBkZgAOAC4AFgB3AGEAYwBoAGEALQBmAGEAcwB0ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKS9NeSBEcml2ZS9QYXBlcnMvVy93YWNoYS1mYXN0MTBwb3N0ZXIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA/AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ bdsk-url-1 = {http://users.soe.ucsc.edu/~carlosm/Papers/S11.pdf},
+ booktitle = {Poster Session at the Conference on File and Storage Technology (FAST 2010)},
+ date-added = {2019-12-27 10:40:59 -0800},
+ date-modified = {2019-12-27 10:43:18 -0800},
+ keywords = {shortpapers, flash, RAID},
+ month = {February 24-27},
+ title = {RAID4S: Adding SSDs to RAID Arrays},
+ year = {2010}
+}
+
diff --git a/content/publication/wacha-fast-10-poster/index.md b/content/publication/wacha-fast-10-poster/index.md
new file mode 100644
index 00000000000..f44c685773a
--- /dev/null
+++ b/content/publication/wacha-fast-10-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "RAID4S: Adding SSDs to RAID Arrays"
+date: 2010-02-01
+publishDate: 2020-01-05T06:43:50.379106Z
+authors: ["Rosie Wacha", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*In Poster Session at the Conference on File and Storage Technology (FAST 2010)*"
+tags: ["shortpapers", "flash", "RAID"]
+---
+
diff --git a/content/publication/watkins-bdmc-13/cite.bib b/content/publication/watkins-bdmc-13/cite.bib
new file mode 100644
index 00000000000..2b06fd3e0cc
--- /dev/null
+++ b/content/publication/watkins-bdmc-13/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{watkins:bdmc13,
+ abstract = {The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the practice of virtually always designing for file-like byte-stream interfaces, co-designed domain-specific storage system interfaces are becoming increasingly common. However, in order for developers to evolve interfaces in high-availability storage systems, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system. Current clustered storage systems that provide interface customizability expose primitive services for managing ad-hoc interfaces. For maximum utility, the ability to create, evolve, and deploy dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and client applications. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system.},
+ address = {Aachen, Germany},
+ author = {Noah Watkins and Carlos Maltzahn and Scott Brandt and Ian Pye and Adam Manzanares},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUVy93YXRraW5zLWJkbWMxMy5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Sd2F0a2lucy1iZG1jMTMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3YXRraW5zLWJkbWMxMy5wZGYADgAmABIAdwBhAHQAawBpAG4AcwAtAGIAZABtAGMAMQAzAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9XL3dhdGtpbnMtYmRtYzEzLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbVy93YXRraW5zLWJkbWMxMy1zbGlkZXMucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GXdhdGtpbnMtYmRtYzEzLXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1iZG1jMTMtc2xpZGVzLnBkZgAADgA0ABkAdwBhAHQAawBpAG4AcwAtAGIAZABtAGMAMQAzAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1iZG1jMTMtc2xpZGVzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ booktitle = {BigDataCloud '13 (in conjunction with EuroPar 2013)},
+ date-added = {2013-07-21 00:37:45 +0000},
+ date-modified = {2020-01-04 23:18:47 -0700},
+ keywords = {papers, datamodel, scripting, storage, systems, software-defined},
+ month = {August 26},
+ title = {In-Vivo Storage System Development},
+ year = {2013}
+}
+
diff --git a/content/publication/watkins-bdmc-13/index.md b/content/publication/watkins-bdmc-13/index.md
new file mode 100644
index 00000000000..8b1a366afb4
--- /dev/null
+++ b/content/publication/watkins-bdmc-13/index.md
@@ -0,0 +1,14 @@
+---
+title: "In-Vivo Storage System Development"
+date: 2013-08-01
+publishDate: 2020-01-05T06:43:50.530355Z
+authors: ["Noah Watkins", "Carlos Maltzahn", "Scott Brandt", "Ian Pye", "Adam Manzanares"]
+publication_types: ["1"]
+abstract: "The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the practice of virtually always designing for file-like byte-stream interfaces, co-designed domain-specific storage system interfaces are becoming increasingly common. However, in order for developers to evolve interfaces in high-availability storage systems, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system. Current clustered storage systems that provide interface customizability expose primitive services for managing ad-hoc interfaces. For maximum utility, the ability to create, evolve, and deploy dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and client applications. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system."
+featured: false
+publication: "*BigDataCloud '13 (in conjunction with EuroPar 2013)*"
+tags: ["papers", "datamodel", "scripting", "storage", "systems", "software-defined"]
+projects:
+- programmable-storage
+---
+
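The version-consistency requirement this abstract raises is easy to illustrate. In the toy registry below (hypothetical names and structure, not a real system's API), a dynamic interface may only be invoked when every storage node hosting the object publishes the same version of it:

```python
# Toy sketch of the version-consistency problem for in-vivo interface
# evolution: a client may use a dynamic storage interface only if all
# relevant nodes agree on its version. Hypothetical, not a real API.

class InterfaceRegistry:
    def __init__(self):
        self.node_versions = {}    # node -> {interface: version}

    def publish(self, node, interface, version):
        self.node_versions.setdefault(node, {})[interface] = version

    def resolve(self, interface, nodes):
        versions = {self.node_versions.get(n, {}).get(interface) for n in nodes}
        if len(versions) != 1 or None in versions:
            raise RuntimeError(f"{interface}: inconsistent versions {versions}")
        return versions.pop()

reg = InterfaceRegistry()
reg.publish("osd.0", "wordcount", 3)
reg.publish("osd.1", "wordcount", 3)
print(reg.resolve("wordcount", ["osd.0", "osd.1"]))   # 3 -- safe to call
reg.publish("osd.1", "wordcount", 4)                  # half-upgraded cluster
# reg.resolve("wordcount", ["osd.0", "osd.1"])        # would raise
```

The paper's branch-and-merge life-cycle is the workflow for moving a cluster from the "would raise" state to a consistent new version without taking the system down.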
diff --git a/content/publication/watkins-hotstorage-17/cite.bib b/content/publication/watkins-hotstorage-17/cite.bib
new file mode 100644
index 00000000000..08d063ea0a8
--- /dev/null
+++ b/content/publication/watkins-hotstorage-17/cite.bib
@@ -0,0 +1,14 @@
+@inproceedings{watkins:hotstorage17,
+ address = {Santa Clara, CA},
+ author = {Noah Watkins and Michael A. Sevilla and Ivo Jimenez and Kathryn Dahlgren and Peter Alvaro and Shel Finkelstein and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAaVy93YXRraW5zLWhvdHN0b3JhZ2UxNy5wZGZPEQGAAAAAAAGAAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Yd2F0a2lucy1ob3RzdG9yYWdlMTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIAQC86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3YXRraW5zLWhvdHN0b3JhZ2UxNy5wZGYADgAyABgAdwBhAHQAawBpAG4AcwAtAGgAbwB0AHMAdABvAHIAYQBnAGUAMQA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgArL015IERyaXZlL1BhcGVycy9XL3dhdGtpbnMtaG90c3RvcmFnZTE3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAQQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAhVy93YXRraW5zLWhvdHN0b3JhZ2UxNy1zbGlkZXMucGRmTxEBnAAAAAABnAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////H3dhdGtpbnMtaG90c3RvcmFnZTE3LXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAEcvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1ob3RzdG9yYWdlMTctc2xpZGVzLnBkZgAADgBAAB8AdwBhAHQAawBpAG4AcwAtAGgAbwB0AHMAdABvAHIAYQBnAGUAMQA3AC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASADIvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1ob3RzdG9yYWdlMTctc2xpZGVzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABIAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAeg=},
+ booktitle = {HotStorage '17},
+ date-added = {2017-05-20 22:54:48 +0000},
+ date-modified = {2017-05-20 22:58:57 +0000},
+ keywords = {papers, storage, systems, declarative, distributed, programmable},
+ month = {July 10-11},
+ title = {DeclStore: Layering is for the Faint of Heart},
+ year = {2017}
+}
+
diff --git a/content/publication/watkins-hotstorage-17/index.md b/content/publication/watkins-hotstorage-17/index.md
new file mode 100644
index 00000000000..85e2ddfbf15
--- /dev/null
+++ b/content/publication/watkins-hotstorage-17/index.md
@@ -0,0 +1,19 @@
+---
+title: "DeclStore: Layering is for the Faint of Heart"
+date: 2017-07-01
+publishDate: 2020-01-05T06:43:50.446394Z
+authors: ["Noah Watkins", "Michael A. Sevilla", "Ivo Jimenez", "Kathryn Dahlgren", "Peter Alvaro", "Shel Finkelstein", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Popular storage systems support diverse storage abstractions by providing important disaggregation benefits. Instead of maintaining a separate system for each abstraction, unified storage systems, in particular, support standard file, block, and object abstractions so the same hardware can be used for a wider range and a more flexible mix of applications. As large-scale unified storage systems continue to evolve to meet the requirements of an increasingly diverse set of applications and next-generation hardware, de jure approaches of the past—based on standardized interfaces—are giving way to domain-specific interfaces and optimizations. While promising, the ad-hoc strategies characteristic of current approaches to co-design are untenable.
+
+
+The standardization of the POSIX I/O interface has been a major success. General adoption has allowed application developers to avoid vendor lock-in and encourages storage system designers to innovate independently. However, large-scale storage systems are generally dominated by proprietary offerings, preventing exploration of alternative interfaces when the need has presented itself. An increase in the number of special-purpose storage systems characterizes recent history in the field, including the emergence of high-performance, and highly modifiable, open-source storage systems, which enable system changes without fear of vendor lock-in. Unfortunately, evolving storage system interfaces is a challenging task requiring domain expertise, and is predicated on the willingness of programmers to forfeit the protection from change afforded by narrow interfaces."
+featured: false
+publication: "*HotStorage '17*"
+tags: ["papers", "storage", "systems", "declarative", "distributed", "programmable"]
+url_slides: https://drive.google.com/file/d/0B5rZ7hI6vXv3b1Y0bGdMaDNXRFk/view?usp=sharing
+projects:
+- declstore
+- programmable-storage
+---
+
diff --git a/content/publication/watkins-pdsw-12/cite.bib b/content/publication/watkins-pdsw-12/cite.bib
new file mode 100644
index 00000000000..8db7ecf0e69
--- /dev/null
+++ b/content/publication/watkins-pdsw-12/cite.bib
@@ -0,0 +1,17 @@
+@inproceedings{watkins:pdsw12,
+ abstract = {As applications become more complex, and the level of concurrency in systems continues to rise, developers are struggling to scale complex data models on top of a traditional byte stream interface. Middleware tailored for specific data models is a common approach to dealing with these challenges, but middleware commonly reproduces scalable services already present in many distributed file systems.
+We present DataMods, an abstraction over existing services found in large-scale storage systems that allows middleware to take advantage of existing, highly tuned services. Specifically, DataMods provides an abstraction for extending storage system services in order to implement native, domain-specific data models and interfaces throughout the storage hierarchy.},
+ address = {Salt Lake City, UT},
+ author = {Noah Watkins and Carlos Maltzahn and Scott A. Brandt and Adam Manzanares},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUVy93YXRraW5zLXBkc3cxMi5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Sd2F0a2lucy1wZHN3MTIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3YXRraW5zLXBkc3cxMi5wZGYADgAmABIAdwBhAHQAawBpAG4AcwAtAHAAZABzAHcAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9XL3dhdGtpbnMtcGRzdzEyLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbVy93YXRraW5zLXBkc3cxMi1zbGlkZXMucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GXdhdGtpbnMtcGRzdzEyLXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1wZHN3MTItc2xpZGVzLnBkZgAADgA0ABkAdwBhAHQAawBpAG4AcwAtAHAAZABzAHcAMQAyAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1wZHN3MTItc2xpZGVzLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:03:40 +0000},
+ date-modified = {2020-01-05 05:27:34 -0700},
+ keywords = {papers, filesystems, programming, datamanagement},
+ month = {November 12},
+ read = {1},
+ title = {DataMods: Programmable File System Services},
+ year = {2012}
+}
+
diff --git a/content/publication/watkins-pdsw-12/index.md b/content/publication/watkins-pdsw-12/index.md
new file mode 100644
index 00000000000..1be3a09d62d
--- /dev/null
+++ b/content/publication/watkins-pdsw-12/index.md
@@ -0,0 +1,14 @@
+---
+title: "DataMods: Programmable File System Services"
+date: 2012-11-01
+publishDate: 2020-01-05T12:39:43.036011Z
+authors: ["Noah Watkins", "Carlos Maltzahn", "Scott A. Brandt", "Adam Manzanares"]
+publication_types: ["1"]
+abstract: "As applications become more complex, and the level of concurrency in systems continue to rise, developers are struggling to scale complex data models on top of a traditional byte stream interface. Middleware tailored for specific data models is a common approach to dealing with these challenges, but middleware commonly reproduces scalable services already present in many distributed file systems. We present DataMods, an abstraction over existing services found in large-scale storage systems that allows middleware to take advantage of existing, highly tuned services. Specifically, DataMods provides an abstraction for extending storage system services in order to implement native, domain-specific data models and interfaces throughout the storage hierarchy."
+featured: false
+publication: "*PDSW'12*"
+tags: ["papers", "filesystems", "programming", "datamanagement"]
+projects:
+- programmable-storage
+---
+
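The DataMods idea of attaching domain-specific interfaces to otherwise byte-oriented storage can be sketched as follows (a hypothetical illustration, not the DataMods API): a typed view is registered next to an object and evaluated where the data lives, so middleware does not have to rebuild data-model logic above a byte stream.

```python
# Toy illustration of the DataMods direction (invented API): the storage
# layer lets a typed interface be registered next to a byte-oriented
# object and evaluated in place.

import struct

class Object:
    """A byte-oriented storage object with pluggable typed views."""
    def __init__(self):
        self.data = b""
        self.views = {}

    def register_view(self, name, fn):
        self.views[name] = fn            # the "data mod": a typed interface

    def call(self, name, *args):
        return self.views[name](self.data, *args)

obj = Object()
obj.data = struct.pack("<4d", 1.0, 2.5, 4.0, 8.5)   # array-of-doubles model

# Domain-specific accessors: element reads and aggregation happen at the
# object, so no raw bytes are shipped to middleware.
obj.register_view("get", lambda data, i: struct.unpack_from("<d", data, 8 * i)[0])
obj.register_view("sum", lambda data: sum(struct.unpack(f"<{len(data)//8}d", data)))

print(obj.call("get", 2))   # 4.0  -- served by the storage-side view
print(obj.call("sum"))      # 16.0
```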
diff --git a/content/publication/watkins-pdsw-15/cite.bib b/content/publication/watkins-pdsw-15/cite.bib
new file mode 100644
index 00000000000..1f18ac85127
--- /dev/null
+++ b/content/publication/watkins-pdsw-15/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{watkins:pdsw15,
+ abstract = {Traditionally storage has not been part of a programming model's semantics and is added only as an I/O library interface. As a result, programming models, languages, and storage systems are limited in the optimizations they can perform for I/O operations, as the semantics of the I/O library is typically at the level of transfers of blocks of uninterpreted bits, with no accompanying knowledge of how those bits are used by the application. For many HPC applications where I/O operations for analyzing and checkpointing large data sets are a non-negligible portion of the overall execution time, such a ``know nothing'' I/O design has negative performance implications.
+We propose an alternative design where the I/O semantics are integrated as part of the programming model, and a common data model is used throughout the entire memory and storage hierarchy enabling storage and application level co-optimizations. We demonstrate these ideas through the integration of storage services within the Legion [2] runtime and present preliminary results demonstrating the integration.},
+ address = {Austin, TX},
+ author = {Noah Watkins and Zhihao Jia and Galen Shipman and Carlos Maltzahn and Alex Aiken and Pat McCormick},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUVy93YXRraW5zLXBkc3cxNS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Sd2F0a2lucy1wZHN3MTUucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3YXRraW5zLXBkc3cxNS5wZGYADgAmABIAdwBhAHQAawBpAG4AcwAtAHAAZABzAHcAMQA1AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9XL3dhdGtpbnMtcGRzdzE1LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn},
+ booktitle = {PDSW'15},
+ date-added = {2016-08-31 06:03:13 +0000},
+ date-modified = {2020-01-04 21:48:24 -0700},
+ keywords = {papers, storage, systems, optimization, parallel, distributed, runtime},
+ month = {November 16},
+ title = {Automatic and transparent I/O optimization with storage integrated application runtime support},
+ year = {2015}
+}
+
diff --git a/content/publication/watkins-pdsw-15/index.md b/content/publication/watkins-pdsw-15/index.md
new file mode 100644
index 00000000000..30e6f7baa00
--- /dev/null
+++ b/content/publication/watkins-pdsw-15/index.md
@@ -0,0 +1,14 @@
+---
+title: "Automatic and transparent I/O optimization with storage integrated application runtime support"
+date: 2015-11-01
+publishDate: 2020-01-05T06:43:50.457995Z
+authors: ["Noah Watkins", "Zhihao Jia", "Galen Shipman", "Carlos Maltzahn", "Alex Aiken", "Pat McCormick"]
+publication_types: ["1"]
+abstract: "Traditionally storage has not been part of a programming model's semantics and is added only as an I/O library interface. As a result, programming models, languages, and storage systems are limited in the optimizations they can perform for I/O operations, as the semantics of the I/O library is typically at the level of transfers of blocks of uninterpreted bits, with no accompanying knowledge of how those bits are used by the application. For many HPC applications where I/O operations for analyzing and checkpointing large data sets are a non-negligible portion of the overall execution time, such a ``know nothing'' I/O design has negative performance implications. We propose an alternative design where the I/O semantics are integrated as part of the programming model, and a common data model is used throughout the entire memory and storage hierarchy enabling storage and application level co-optimizations. We demonstrate these ideas through the integration of storage services within the Legion [2] runtime and present preliminary results demonstrating the integration."
+featured: false
+publication: "*PDSW'15*"
+tags: ["papers", "storage", "systems", "optimization", "parallel", "distributed", "runtime"]
+projects:
+- programmable-storage
+---
+
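One way to picture the proposed design (a loose sketch under invented names, not Legion's API): the application hands the runtime a structured region rather than opaque bytes, which leaves the runtime free to schedule the I/O itself, here simply overlapping a checkpoint with computation.

```python
# Loose sketch (hypothetical names, not Legion): a typed region is
# checkpointed in the background while computation continues. Because
# the runtime sees the region's structure rather than raw bytes, it
# could also re-layout, compress, or partition it; a real runtime would
# snapshot the region first to keep the checkpoint consistent.

import json
import threading

def checkpoint(region: dict, path: str) -> threading.Thread:
    def _write():
        with open(path, "w") as f:
            json.dump(region, f)        # typed, runtime-managed write
    t = threading.Thread(target=_write)
    t.start()
    return t

region = {"field": "pressure", "cells": [1.0, 2.0, 3.0]}
t = checkpoint(region, "/tmp/ckpt.json")  # I/O moves off the critical path
result = sum(region["cells"])             # computation overlaps the I/O
t.join()
```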
diff --git a/content/publication/watkins-socc-16-poster/cite.bib b/content/publication/watkins-socc-16-poster/cite.bib
new file mode 100644
index 00000000000..cc4d154fb5d
--- /dev/null
+++ b/content/publication/watkins-socc-16-poster/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{watkins:socc16-poster,
+ address = {Santa Clara, CA},
+ author = {Noah Watkins and Michael Sevilla and Ivo Jimenez and Neha Ojha and Peter Alvaro and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbVy93YXRraW5zLXNvY2MxNi1wb3N0ZXIucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GXdhdGtpbnMtc29jYzE2LXBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1zb2NjMTYtcG9zdGVyLnBkZgAADgA0ABkAdwBhAHQAawBpAG4AcwAtAHMAbwBjAGMAMQA2AC0AcABvAHMAdABlAHIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1zb2NjMTYtcG9zdGVyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=},
+ booktitle = {SoCC'16},
+ date-added = {2016-12-21 23:16:32 +0000},
+ date-modified = {2020-01-04 21:46:57 -0700},
+ keywords = {shortpapers, declarative, storage, programmable},
+ month = {October 5-7},
+ title = {Brados: Declarative, Programmable Object Storage},
+ year = {2016}
+}
+
diff --git a/content/publication/watkins-socc-16-poster/index.md b/content/publication/watkins-socc-16-poster/index.md
new file mode 100644
index 00000000000..12b50cb8fb8
--- /dev/null
+++ b/content/publication/watkins-socc-16-poster/index.md
@@ -0,0 +1,12 @@
+---
+title: "Brados: Declarative,Programmable Object Storage"
+date: 2016-10-01
+publishDate: 2020-01-05T06:43:50.453146Z
+authors: ["Noah Watkins", "Michael Sevilla", "Ivo Jimenez", "Neha Ohja", "Peter Alvaro", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*SoCC'16*"
+tags: ["shortpapers", "declarative", "storage", "programmable"]
+---
+
diff --git a/content/publication/watkins-soetr-12/cite.bib b/content/publication/watkins-soetr-12/cite.bib
new file mode 100644
index 00000000000..c64df641251
--- /dev/null
+++ b/content/publication/watkins-soetr-12/cite.bib
@@ -0,0 +1,16 @@
+@techreport{watkins:soetr12,
+ abstract = {Cloud-based services have become an attractive alternative to in-house data centers because of their flexible, on-demand availability of compute and storage resources. This is also true for scientific high-performance computing (HPC) applications that are currently being run on expensive, dedicated hardware. One important challenge of HPC applications is their need to perform periodic global checkpoints of execution state to stable storage in order to recover from failures, but the checkpoint process can dominate the total run-time of HPC applications even in the failure-free case! In HPC architectures, dedicated stable storage is highly tuned for this type of workload using locality and physical layout policies, which are generally unknown in typical cloud environments. In this paper we introduce DataMods, an extended version of the Ceph file system and associated distributed object store RADOS, which are widely used in open source cloud stacks. DataMods extends object-based storage with extended services that take advantage of common cloud data center node hardware configurations (i.e. CPU and local storage resources), and that can be used to construct efficient, scalable middleware services that span the entire storage stack and utilize asynchronous services for offline data management services.},
+ address = {Santa Cruz, CA},
+ author = {Noah Watkins and Carlos Maltzahn and Scott A. Brandt and Adam Manzanares},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVVy93YXRraW5zLXNvZXRyMTIucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E3dhdGtpbnMtc29ldHIxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1zb2V0cjEyLnBkZgAADgAoABMAdwBhAHQAawBpAG4AcwAtAHMAbwBlAHQAcgAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1zb2V0cjEyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=},
+ date-added = {2012-07-21 11:39:45 +0000},
+ date-modified = {2020-01-05 05:29:20 -0700},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, filesystems, programming, datamanagement},
+ month = {July},
+ number = {UCSC-SOE-12-07},
+ title = {DataMods: Programmable File System Services},
+ type = {Technical Report},
+ year = {2012}
+}
+
diff --git a/content/publication/watkins-soetr-12/index.md b/content/publication/watkins-soetr-12/index.md
new file mode 100644
index 00000000000..01de588ed7d
--- /dev/null
+++ b/content/publication/watkins-soetr-12/index.md
@@ -0,0 +1,12 @@
+---
+title: "DataMods: Programmable File System Services"
+date: 2012-07-01
+publishDate: 2020-01-05T13:33:05.966641Z
+authors: ["Noah Watkins", "Carlos Maltzahn", "Scott A. Brandt", "Adam Manzanares"]
+publication_types: ["4"]
+abstract: "Cloud-based services have become an attractive alternative to in-house data centers because of their flexible, on-demand availability of compute and storage resources. This is also true for scientific high-performance computing (HPC) applications that are currently being run on expensive, dedicated hardware. One important challenge of HPC applications is their need to perform periodic global checkpoints of execution state to stable storage in order to recover from failures, but the checkpoint process can dominate the total run-time of HPC applications even in the failure-free case! In HPC architectures, dedicated stable storage is highly tuned for this type of workload using locality and physical layout policies, which are generally unknown in typical cloud environments. In this paper we introduce DataMods, an extended version of the Ceph file system and associated distributed object store RADOS, which are widely used in open source cloud stacks. DataMods extends object-based storage with extended services take advantage of common cloud data center node hardware configurations (i.e. CPU and local storage resources), and that can be used to construct efficient, scalable middleware services that span the entire storage stack and utilize asynchronous services for offline data management services."
+featured: false
+publication: ""
+tags: ["papers", "filesystems", "programming", "datamanagement"]
+---
+
diff --git a/content/publication/watkins-ucsctr-13/cite.bib b/content/publication/watkins-ucsctr-13/cite.bib
new file mode 100644
index 00000000000..3a2fa0ca465
--- /dev/null
+++ b/content/publication/watkins-ucsctr-13/cite.bib
@@ -0,0 +1,17 @@
+@techreport{watkins:ucsctr13,
+ abstract = {The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the common practice of translating all I/O access onto the POSIX file interface, it will soon be common for application development to include the co-design of storage system interfaces. In order for developers to evolve a co-design in high-availability clusters, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system.
+Current clustered storage systems that provide interface customizability expose primitive services for managing static interfaces. For maximum utility, creating, evolving, and deploying dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and clients. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system.},
+ address = {Santa Cruz, CA},
+ author = {Noah Watkins and Carlos Maltzahn and Scott Brandt and Ian Pye and Adam Manzanares},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWVy93YXRraW5zLXVjc2N0cjEzLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xR3YXRraW5zLXVjc2N0cjEzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABVwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpXOndhdGtpbnMtdWNzY3RyMTMucGRmAA4AKgAUAHcAYQB0AGsAaQBuAHMALQB1AGMAcwBjAHQAcgAxADMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy11Y3NjdHIxMy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2013-05-30 22:41:44 +0000},
+ date-modified = {2020-01-04 23:01:28 -0700},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, datamodel, scripting, storage, systems, software-defined},
+ month = {March 16},
+ number = {UCSC-SOE-13-02},
+ title = {In-Vivo Storage System Development},
+ type = {Technical Report},
+ year = {2013}
+}
+
diff --git a/content/publication/watkins-ucsctr-13/index.md b/content/publication/watkins-ucsctr-13/index.md
new file mode 100644
index 00000000000..c65f84f4e2b
--- /dev/null
+++ b/content/publication/watkins-ucsctr-13/index.md
@@ -0,0 +1,12 @@
+---
+title: "In-Vivo Storage System Development"
+date: 2013-03-01
+publishDate: 2020-01-05T06:43:50.545312Z
+authors: ["Noah Watkins", "Carlos Maltzahn", "Scott Brandt", "Ian Pye", "Adam Manzanares"]
+publication_types: ["4"]
+abstract: "The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the common practice of translating all I/O access onto the POSIX file interface, it will soon be common for application development to include the co-design of storage system interfaces. In order for developers to evolve a co-design in high-availability clusters, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system. Current clustered storage systems that provide interface customizability expose primitive services for managing static interfaces. For maximum utility, creating, evolving, and deploying dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and clients. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system."
+featured: false
+publication: ""
+tags: ["papers", "datamodel", "scripting", "storage", "systems", "software-defined"]
+---
+
diff --git a/content/publication/watkins-ucsctr-15/cite.bib b/content/publication/watkins-ucsctr-15/cite.bib
new file mode 100644
index 00000000000..cc41c1753fe
--- /dev/null
+++ b/content/publication/watkins-ucsctr-15/cite.bib
@@ -0,0 +1,16 @@
+@techreport{watkins:ucsctr15,
+ abstract = {As applications scale to new levels and migrate into cloud environments, there has been a significant departure from the exclusive reliance on the POSIX file I/O interface. However, in doing so, applications often discover a lack of services, forcing them to use bolt-on features or take on the responsibility of critical data management tasks. This often results in duplication of complex software with extreme correctness requirements. Instead, wouldn't it be nice if an application could just convey what it wanted out of a storage system, and have the storage system understand?
+The central question we address in this paper is whether or not the design delta between two storage systems can be expressed in a form such that one system becomes little more than a configuration of the other. Storage systems should expose their useful services in a way that separates performance from correctness, allowing for their safe reuse. After all, hardened code in storage systems protects countless value, and its correctness is only as good as the stress we place on it. We demonstrate these concepts by synthesizing the CORFU high-performance shared-log abstraction in Ceph through minor modifications of existing sub-systems that are orthogonal to correctness.},
+ author = {Noah Watkins and Michael Sevilla and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWVy93YXRraW5zLXVjc2N0cjE1LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xR3YXRraW5zLXVjc2N0cjE1LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABVwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpXOndhdGtpbnMtdWNzY3RyMTUucGRmAA4AKgAUAHcAYQB0AGsAaQBuAHMALQB1AGMAcwBjAHQAcgAxADUALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy11Y3NjdHIxNS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2015-06-11 07:31:24 +0000},
+ date-modified = {2020-01-04 21:51:36 -0700},
+ institution = {UC Santa Cruz},
+ keywords = {papers, programmable, storage, systems},
+ month = {June 11},
+ number = {UCSC-SOE-15-12},
+ title = {The Case for Programmable Object Storage Systems},
+ type = {Tech. rept.},
+ year = {2015}
+}
+
diff --git a/content/publication/watkins-ucsctr-15/index.md b/content/publication/watkins-ucsctr-15/index.md
new file mode 100644
index 00000000000..aa62ac9414f
--- /dev/null
+++ b/content/publication/watkins-ucsctr-15/index.md
@@ -0,0 +1,12 @@
+---
+title: "The Case for Programmable Object Storage Systems"
+date: 2015-06-01
+publishDate: 2020-01-05T06:43:50.472832Z
+authors: ["Noah Watkins", "Michael Sevilla", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: "As applications scale to new levels and migrate into cloud environments, there has been a significant departure from the exclusive reliance on the POSIX file I/O interface. However in doing so, application often discover a lack of services, forcing them to use bolt-on features or take on the respon- sibility of critical data management tasks. This often results in duplication of complex software with extreme correctness requirements. Instead, wouldn't it be nice if an application could just convey what it wanted out of a storage system, and have the storage system understand? The central question we address in this paper is whether or not the design delta between two storage systems can be expressed in a form such that one system becomes lit- tle more than a configuration of the other. Storage systems should expose their useful services in a way that separates performance from correctness, allowing for their safe reuse. After all, hardened code in storage systems protects count- less value, and its correctness is only as good as the stress we place on it. We demonstrate these concepts by synthesiz- ing the CORFU high-performance shared-log abstraction in Ceph through minor modifications of existing sub-systems that are orthogonal to correctness."
+featured: false
+publication: ""
+tags: ["papers", "programmable", "storage", "systems"]
+---
+
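To make the "design delta as configuration" claim concrete: a CORFU-style shared log needs little more than a sequencer plus one object write per log position. The sketch below is an illustration under assumed names (`ObjectStore`, `SharedLog`), not the paper's Ceph implementation:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ObjectStore is a stand-in for a distributed object store such as RADOS.
type ObjectStore interface {
	Put(key string, val []byte) error
	Get(key string) ([]byte, error)
}

// memStore is an in-memory ObjectStore for demonstration only.
type memStore map[string][]byte

func (m memStore) Put(k string, v []byte) error { m[k] = v; return nil }
func (m memStore) Get(k string) ([]byte, error) {
	v, ok := m[k]
	if !ok {
		return nil, fmt.Errorf("no entry at %s", k)
	}
	return v, nil
}

// SharedLog exposes a CORFU-like append/read interface over the store.
type SharedLog struct {
	store ObjectStore
	tail  int64 // the "sequencer": next free log position
}

// Append reserves the next position and writes the entry there.
func (l *SharedLog) Append(data []byte) (int64, error) {
	pos := atomic.AddInt64(&l.tail, 1) - 1
	return pos, l.store.Put(fmt.Sprintf("log.%d", pos), data)
}

// Read returns the entry at a previously appended position.
func (l *SharedLog) Read(pos int64) ([]byte, error) {
	return l.store.Get(fmt.Sprintf("log.%d", pos))
}

func main() {
	log := &SharedLog{store: memStore{}}
	pos, _ := log.Append([]byte("hello"))
	entry, _ := log.Read(pos)
	fmt.Println(pos, string(entry)) // 0 hello
}
```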
diff --git a/content/publication/watkins-ucsctr-16/cite.bib b/content/publication/watkins-ucsctr-16/cite.bib
new file mode 100644
index 00000000000..c358eb48bcb
--- /dev/null
+++ b/content/publication/watkins-ucsctr-16/cite.bib
@@ -0,0 +1,17 @@
+@techreport{watkins:ucsctr16,
+ abstract = {As applications scale to new levels and migrate into cloud environments, there has been a significant departure from the exclusive reliance on the POSIX file I/O interface. However, in doing so, applications often discover a lack of services, forcing them to use bolt-on features or take on the responsibility of critical data management tasks. This often results in duplication of complex software with extreme correctness requirements. Instead, wouldn't it be nice if an application could just convey what it wanted out of a storage system, and have the storage system understand?
+The central question we address in this paper is whether or not the design delta between two storage systems can be expressed in a form such that one system becomes little more than a configuration of the other. Storage systems should expose their useful services in a way that separates performance from correctness, allowing for their safe reuse. After all, hardened code in storage systems protects countless value, and its correctness is only as good as the stress we place on it. We demonstrate these concepts by synthesizing the CORFU high-performance shared-log abstraction in Ceph through minor modifications of existing sub-systems that are orthogonal to correctness.},
+ address = {Santa Cruz, CA},
+ author = {Noah Watkins and Michael Sevilla and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWVy93YXRraW5zLXVjc2N0cjE2LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xR3YXRraW5zLXVjc2N0cjE2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABVwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpXOndhdGtpbnMtdWNzY3RyMTYucGRmAA4AKgAUAHcAYQB0AGsAaQBuAHMALQB1AGMAcwBjAHQAcgAxADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy11Y3NjdHIxNi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ date-added = {2016-08-26 18:45:34 +0000},
+ date-modified = {2020-01-04 21:48:55 -0700},
+ institution = {UC Santa Cruz},
+ keywords = {papers, programmable, storage, systems},
+ month = {June 11},
+ number = {UCSC-SOE-15-12},
+ title = {The Case for Programmable Object Storage Systems},
+ type = {Tech. rept.},
+ year = {2016}
+}
+
diff --git a/content/publication/watkins-ucsctr-16/index.md b/content/publication/watkins-ucsctr-16/index.md
new file mode 100644
index 00000000000..734dcf2d56b
--- /dev/null
+++ b/content/publication/watkins-ucsctr-16/index.md
@@ -0,0 +1,12 @@
+---
+title: "The Case for Programmable Object Storage Systems"
+date: 2016-06-01
+publishDate: 2020-01-05T06:43:50.460442Z
+authors: ["Noah Watkins", "Michael Sevilla", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: "As applications scale to new levels and migrate into cloud environments, there has been a significant departure from the exclusive reliance on the POSIX file I/O interface. However in doing so, application often discover a lack of services, forcing them to use bolt-on features or take on the responsibility of critical data management tasks. This often results in duplication of complex software with extreme correctness requirements. Instead, wouldn't it be nice if an application could just convey what it wanted out of a storage system, and have the storage system understand? The central question we address in this paper is whether or not the design delta between two storage systems can be expressed in a form such that one system becomes little more than a configuration of the other. Storage systems should expose their useful services in a way that separates performance from correctness, allowing for their safe reuse. After all, hardened code in storage systems protects countless value, and its correctness is only as good as the stress we place on it. We demonstrate these concepts by synthesizing the CORFU high-performance shared-log abstraction in Ceph through minor modifications of existing sub-systems that are orthogonal to correctness."
+featured: false
+publication: ""
+tags: ["papers", "programmable", "storage", "systems"]
+---
+
diff --git a/content/publication/weil-lsf-07/cite.bib b/content/publication/weil-lsf-07/cite.bib
new file mode 100644
index 00000000000..3c1566d0baa
--- /dev/null
+++ b/content/publication/weil-lsf-07/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{weil:lsf07,
+ address = {San Jose, CA},
+ author = {Sage Weil and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQVy93ZWlsLWxzZjA3LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w53ZWlsLWxzZjA3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABVwAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpXOndlaWwtbHNmMDcucGRmAA4AHgAOAHcAZQBpAGwALQBsAHMAZgAwADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1sc2YwNy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ booktitle = {Linux Storage and Filesystem Workshop (LSF07), held in conjunction with the Conference on File and Storage Technology (FAST 07)},
+ date-added = {2019-12-29 16:46:38 -0800},
+ date-modified = {2019-12-29 16:46:38 -0800},
+ keywords = {shortpapers, storage, scalable},
+ month = {February 12--13},
+ title = {Scaling Linux Storage to Petabytes},
+ year = {2007}
+}
+
diff --git a/content/publication/weil-lsf-07/index.md b/content/publication/weil-lsf-07/index.md
new file mode 100644
index 00000000000..c534bdc7157
--- /dev/null
+++ b/content/publication/weil-lsf-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "Scaling Linux Storage to Petabytes"
+date: 2007-02-01
+publishDate: 2020-01-05T06:43:50.372678Z
+authors: ["Sage Weil", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: ""
+featured: false
+publication: "*Linux Storage and Filesystem Workshop (LSF07), held in conjunction with the Conference on File and Storage Technology (FAST 07)*"
+tags: ["shortpapers", "storage", "scalable"]
+---
+
diff --git a/content/publication/weil-osdi-06/cite.bib b/content/publication/weil-osdi-06/cite.bib
new file mode 100644
index 00000000000..1cc9ef31fec
--- /dev/null
+++ b/content/publication/weil-osdi-06/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{weil:osdi06,
+ abstract = {We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.},
+ address = {Seattle, WA},
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Darrell D. E. Long and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARVy93ZWlsLW9zZGkwNi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Pd2VpbC1vc2RpMDYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3ZWlsLW9zZGkwNi5wZGYAAA4AIAAPAHcAZQBpAGwALQBvAHMAZABpADAANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvVy93ZWlsLW9zZGkwNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ booktitle = {OSDI'06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:03:57 -0700},
+ keywords = {papers, parallel, filesystems, distributed, storage, systems, obsd, p2p},
+ month = {November},
+ read = {1},
+ title = {Ceph: A Scalable, High-Performance Distributed File System},
+ year = {2006}
+}
+
diff --git a/content/publication/weil-osdi-06/index.md b/content/publication/weil-osdi-06/index.md
new file mode 100644
index 00000000000..72f2cb9391f
--- /dev/null
+++ b/content/publication/weil-osdi-06/index.md
@@ -0,0 +1,12 @@
+---
+title: "Ceph: A Scalable, High-Performance Distributed File System"
+date: 2006-11-01
+publishDate: 2020-01-05T13:33:05.998495Z
+authors: ["Sage A. Weil", "Scott A. Brandt", "Ethan L. Miller", "Darrell D. E. Long", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second."
+featured: false
+publication: "*OSDI'06*"
+tags: ["papers", "parallel", "filesystems", "distributed", "storage", "systems", "obsd", "p2p"]
+---
+
diff --git a/content/publication/weil-pdsw-07/cite.bib b/content/publication/weil-pdsw-07/cite.bib
new file mode 100644
index 00000000000..b6911bec229
--- /dev/null
+++ b/content/publication/weil-pdsw-07/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{weil:pdsw07,
+ abstract = {Brick and object-based storage architectures have emerged as a means of improving the scalability of storage clusters. However, existing systems continue to treat storage nodes as passive devices, despite their ability to exhibit significant intelligence and autonomy. We present the design and implementation of RADOS, a reliable object storage service that can scale to many thousands of devices by leveraging the intelligence present in individual storage nodes. RADOS preserves consistent data access and strong safety semantics while allowing nodes to act semi-autonomously to self-manage replication, failure detection, and failure recovery through the use of a small cluster map. Our implementation offers excellent performance, reliability, and scalability while providing clients with the illusion of a single logical object store.},
+ address = {Reno, NV},
+ author = {Sage A. Weil and Andrew Leung and Scott A. Brandt and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARVy93ZWlsLXBkc3cwNy5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Pd2VpbC1wZHN3MDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3ZWlsLXBkc3cwNy5wZGYAAA4AIAAPAHcAZQBpAGwALQBwAGQAcwB3ADAANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvVy93ZWlsLXBkc3cwNy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:20:07 -0700},
+ keywords = {papers, obsd, distributed, storage, systems, related:x10},
+ local-url = {/Users/carlosmalt/Documents/Papers/weil-pdsw07.pdf},
+ month = {November},
+ title = {RADOS: A Fast, Scalable, and Reliable Storage Service for Petabyte-scale Storage Clusters},
+ year = {2007}
+}
+
diff --git a/content/publication/weil-pdsw-07/index.md b/content/publication/weil-pdsw-07/index.md
new file mode 100644
index 00000000000..f08a318ff86
--- /dev/null
+++ b/content/publication/weil-pdsw-07/index.md
@@ -0,0 +1,12 @@
+---
+title: "RADOS: A Fast, Scalable, and Reliable Storage Service for Petabyte-scale Storage Clusters"
+date: 2007-11-01
+publishDate: 2020-01-05T13:33:06.014350Z
+authors: ["Sage A. Weil", "Andrew Leung", "Scott A. Brandt", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Brick and object-based storage architectures have emerged as a means of improving the scalability of storage clusters. However, existing systems continue to treat storage nodes as passive devices, despite their ability to exhibit significant intelligence and autonomy. We present the design and implementation of RADOS, a reliable object storage service that can scales to many thousands of devices by leveraging the intelligence present in individual storage nodes. RADOS preserves consistent data access and strong safety semantics while allowing nodes to act semi-autonomously to self-manage replication, failure detection, and failure recovery through the use of a small cluster map. Our implementation offers excellent performance, reliability, and scalability while providing clients with the illusion of a single logical object store."
+featured: false
+publication: "*Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)*"
+tags: ["papers", "obsd", "distributed", "storage", "systems", "related:x10"]
+---
+
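The "small cluster map" in the abstract is the coordination backbone: requests carry the client's map epoch, so any node can detect a stale client without a central directory. A rough sketch, with field and method names assumed rather than taken from RADOS:

```go
package main

import "fmt"

// ClusterMap is an illustrative stand-in for the compact map RADOS
// distributes; the fields here are assumptions, not RADOS code.
type ClusterMap struct {
	Epoch int      // incremented on membership or failure-status changes
	OSDs  []string // currently active storage devices
}

// Admit decides whether a request built against clientEpoch may proceed.
func (m ClusterMap) Admit(clientEpoch int) error {
	if clientEpoch < m.Epoch {
		return fmt.Errorf("stale map: client at epoch %d, cluster at %d",
			clientEpoch, m.Epoch)
	}
	return nil
}

func main() {
	m := ClusterMap{Epoch: 7, OSDs: []string{"osd0", "osd1", "osd2"}}
	fmt.Println(m.Admit(6)) // stale: client must fetch the newer map first
	fmt.Println(m.Admit(7)) // <nil>: proceed
}
```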
diff --git a/content/publication/weil-sc-06/cite.bib b/content/publication/weil-sc-06/cite.bib
new file mode 100644
index 00000000000..afb372100be
--- /dev/null
+++ b/content/publication/weil-sc-06/cite.bib
@@ -0,0 +1,15 @@
+@inproceedings{weil:sc06,
+ abstract = {Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available resources and maximize system performance, while facilitating system growth and managing hardware failures. We have developed CRUSH, a scalable pseudo-random data distribution function designed for distributed object-based storage systems that efficiently maps data objects to storage devices without relying on a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage while minimizing unnecessary data movement. The algorithm accommodates a wide variety of data replication and reliability mechanisms and distributes data in terms of user-defined policies that enforce separation of replicas across failure domains.},
+ address = {Tampa, FL},
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPVy93ZWlsLXNjMDYucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DXdlaWwtc2MwNi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2VpbC1zYzA2LnBkZgAADgAcAA0AdwBlAGkAbAAtAHMAYwAwADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1zYzA2LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ booktitle = {SC '06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:10:11 -0700},
+ keywords = {papers, hashing, parallel, filesystems, placement, related:ceph, obsd},
+ month = {November},
+ publisher = {ACM},
+ title = {CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data},
+ year = {2006}
+}
+
diff --git a/content/publication/weil-sc-06/index.md b/content/publication/weil-sc-06/index.md
new file mode 100644
index 00000000000..0e0ee0b3018
--- /dev/null
+++ b/content/publication/weil-sc-06/index.md
@@ -0,0 +1,12 @@
+---
+title: "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data"
+date: 2006-11-01
+publishDate: 2020-01-05T13:33:06.000596Z
+authors: ["Sage A. Weil", "Scott A. Brandt", "Ethan L. Miller", "Carlos Maltzahn"]
+publication_types: ["1"]
+abstract: "Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available resources and maximize system performance, while facilitating system growth and managing hardware failures. We have developed CRUSH, a scalable pseudo-random data distribution function designed for distributed object-based storage systems that efficiently maps data objects to storage devices without relying on a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage while minimizing unnecessary data movement. The algorithm accommodates a wide variety of data replication and reliability mechanisms and distributes data in terms of user-defined policies that enforce separation of replicas across failure domains."
+featured: false
+publication: "*SC '06*"
+tags: ["papers", "hashing", "parallel", "filesystems", "placement", "related:ceph", "obsd"]
+---
+
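CRUSH itself handles weights, hierarchies, and failure domains, but the core idea, mapping an object to devices by pure computation with no central directory, can be sketched with rendezvous (highest-random-weight) hashing. This is a stand-in illustration, not the CRUSH algorithm:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score deterministically mixes an object name with a device name.
func score(object, device string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(object + "/" + device))
	return h.Sum64()
}

// place returns the n highest-scoring devices for an object (n must not
// exceed len(devices)); every client computes the same answer independently.
func place(object string, devices []string, n int) []string {
	sorted := append([]string(nil), devices...)
	sort.Slice(sorted, func(i, j int) bool {
		return score(object, sorted[i]) > score(object, sorted[j])
	})
	return sorted[:n]
}

func main() {
	osds := []string{"osd0", "osd1", "osd2", "osd3"}
	fmt.Println(place("myobject", osds, 2)) // two replica locations
}
```

Adding or removing a device only relocates objects whose top-n set changes, which echoes the abstract's goal of minimizing unnecessary data movement; CRUSH additionally spreads replicas across administrator-defined failure domains.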
diff --git a/content/publication/weil-tr-ucsc-ceph-06/cite.bib b/content/publication/weil-tr-ucsc-ceph-06/cite.bib
new file mode 100644
index 00000000000..7d8c0844f4a
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-ceph-06/cite.bib
@@ -0,0 +1,14 @@
+@techreport{weil:tr-ucsc-ceph06,
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Darrell D. E. Long and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWVy93ZWlsLXBoZHRoZXNpczA3LnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xR3ZWlsLXBoZHRoZXNpczA3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABVwAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpXOndlaWwtcGhkdGhlc2lzMDcucGRmAA4AKgAUAHcAZQBpAGwALQBwAGgAZAB0AGgAZQBzAGkAcwAwADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1waGR0aGVzaXMwNy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ bdsk-url-1 = {http://ceph.newdream.net/weil-thesis.pdf},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ local-url = {/Users/carlosmalt/Documents/Papers/weil-tr-ucsc-ceph06.pdf},
+ month = {Jan},
+ number = {SSRC-06-02},
+ title = {Ceph: A Scalable Object-based Storage System},
+ year = {2006}
+}
+
diff --git a/content/publication/weil-tr-ucsc-ceph-06/index.md b/content/publication/weil-tr-ucsc-ceph-06/index.md
new file mode 100644
index 00000000000..b5e413c72fa
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-ceph-06/index.md
@@ -0,0 +1,11 @@
+---
+title: "Ceph: A Scalable Object-based Storage System"
+date: 2006-01-01
+publishDate: 2020-01-05T06:43:50.645771Z
+authors: ["Sage A. Weil", "Scott A. Brandt", "Ethan L. Miller", "Darrell D. E. Long", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/weil-tr-ucsc-crush-06/cite.bib b/content/publication/weil-tr-ucsc-crush-06/cite.bib
new file mode 100644
index 00000000000..cd3936d0a75
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-crush-06/cite.bib
@@ -0,0 +1,13 @@
+@techreport{weil:tr-ucsc-crush06,
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Carlos Maltzahn},
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/weil-tr-crush06.pdf},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ local-url = {/Users/carlosmalt/Documents/Papers/weil-tr-ucsc-crush06.pdf},
+ month = {Jan},
+ number = {SSRC-06-01},
+ title = {CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data},
+ year = {2006}
+}
+
diff --git a/content/publication/weil-tr-ucsc-crush-06/index.md b/content/publication/weil-tr-ucsc-crush-06/index.md
new file mode 100644
index 00000000000..c1394a8aa6e
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-crush-06/index.md
@@ -0,0 +1,11 @@
+---
+title: "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data"
+date: 2006-01-01
+publishDate: 2020-01-05T06:43:50.647348Z
+authors: ["Sage A. Weil", "Scott A. Brandt", "Ethan L. Miller", "Carlos Maltzahn"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/weil-tr-ucsc-rados-07/cite.bib b/content/publication/weil-tr-ucsc-rados-07/cite.bib
new file mode 100644
index 00000000000..801ad8087bc
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-rados-07/cite.bib
@@ -0,0 +1,14 @@
+@techreport{weil:tr-ucsc-rados07,
+ author = {Sage Weil and Carlos Maltzahn and Scott A. Brandt},
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/weil-tr-rados07.pdf},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ local-url = {/Users/carlosmalt/Documents/Papers/weil-tr-ucsc-rados07.pdf},
+ month = {Jan},
+ note = {Please notify the authors when citing this tech report in a paper for publication},
+ number = {SSRC-07-01},
+ title = {RADOS: A Reliable Autonomic Distributed Object Store},
+ year = {2007}
+}
+
diff --git a/content/publication/weil-tr-ucsc-rados-07/index.md b/content/publication/weil-tr-ucsc-rados-07/index.md
new file mode 100644
index 00000000000..3a371f72ee7
--- /dev/null
+++ b/content/publication/weil-tr-ucsc-rados-07/index.md
@@ -0,0 +1,11 @@
+---
+title: "RADOS: A Reliable Autonomic Distributed Object Store"
+date: 2007-01-01
+publishDate: 2020-01-05T06:43:50.659444Z
+authors: ["Sage Weil", "Carlos Maltzahn", "Scott A. Brandt"]
+publication_types: ["4"]
+abstract: ""
+featured: false
+publication: ""
+---
+
diff --git a/content/publication/zakaria-nixcon-22/cite.bib b/content/publication/zakaria-nixcon-22/cite.bib
new file mode 100644
index 00000000000..eba348447fb
--- /dev/null
+++ b/content/publication/zakaria-nixcon-22/cite.bib
@@ -0,0 +1,18 @@
+@unpublished{zakaria:nixcon22,
+ abstract = {Nix has introduced the world to store-based systems and ushered in a new wave of reproducibility. These new systems, however, are built atop long-established patterns and occasionally leverage them to band-aid over the problems Nix aims to solve.
+
+How much further can we leverage the store abstraction to rethink long-valued, established patterns in Unix-based operating systems? This talk will introduce some of the simple improvements one can uncover starting at the linking phase of object building and process startup.
+
+The authors introduce Shrinkwrap, which can greatly improve startup performance and further improve reproducibility for applications ported to Nix by making simple improvements to how libraries are discovered and leveraging the store further. Additional explorations for improvements during the linking phase will be discussed and explored. It's time we rethink everything.},
+ author = {Farid Zakaria and Tom Scogland and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLW5peGNvbjIyLnBkZk8RAXgAAAAAAXgAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAN9z69VCRAAB/////xR6YWthcmlhLW5peGNvbjIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////348MogAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADWC1aAAACADsvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlgtWjp6YWthcmlhLW5peGNvbjIyLnBkZgAADgAqABQAegBhAGsAYQByAGkAYQAtAG4AaQB4AGMAbwBuADIAMgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLW5peGNvbjIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAWAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHU},
+ bdsk-url-1 = {https://drive.google.com/file/d/1uFE5UfvteXxkM4KCOjbSh52yGPa2hZtg/view},
+ date-added = {2022-11-07 19:32:09 -0800},
+ date-modified = {2022-11-07 19:32:09 -0800},
+ keywords = {linking, reproducibility, packaging},
+ month = {October 20-22},
+ note = {NixCon 2022, Paris, France},
+ title = {Rethinking basic primitives for store based systems},
+ year = {2022}
+}
+
diff --git a/content/publication/zakaria-nixcon-22/index.md b/content/publication/zakaria-nixcon-22/index.md
new file mode 100644
index 00000000000..c8829fd2f98
--- /dev/null
+++ b/content/publication/zakaria-nixcon-22/index.md
@@ -0,0 +1,17 @@
+---
+title: "Rethinking basic primitives for store based systems "
+date: 2022-10-01
+publishDate: 2022-11-08T04:51:16.018609Z
+authors: ["Farid Zakaria", "Tom Scogland", "Carlos Maltzahn"]
+publication_types: ["3"]
+abstract: "Nix has introduced the world to store-based systems and ushered a new wave of reproducibility. These new systems however are built atop long established patterns and occasionally leverage them to band-aid over the problems Nix aims to solve. How much further can we leverage the store abstraction to rethink long valued established patterns in Unix based operating systems? This talk will introduce some of the simple improvements one can uncover starting at the linking phase of object building and process startup. The authors introduce Shrinkwrap which can greatly improve startup performance and further improve reproducibility for applications ported to Nix by making simple improvement to how libraries are discovered and leveraging the store further. Additional explorations for improvements during the linking phase will be discussed and explored. It's time we rethink everything."
+featured: false
+publication: "NixCon 2022, Paris, France, October 20-22"
+tags: ["linking", "reproducibility", "packaging"]
+projects:
+- packaging
+url_video: "https://drive.google.com/file/d/1uFE5UfvteXxkM4KCOjbSh52yGPa2hZtg/view"
+
+---
+
+
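For readers new to store-based systems: the store the talk builds on derives a component's path from a hash of its contents, so the same inputs land at the same path on every machine. A toy sketch follows; Nix's real store hashing differs in detail (base32 output, derivation-level hashing), so treat the path format here as an assumption:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// storePath derives a component's location from a hash of its contents,
// so identical inputs yield identical paths on every machine.
func storePath(name string, contents []byte) string {
	sum := sha256.Sum256(contents)
	return fmt.Sprintf("/store/%x-%s", sum[:8], name)
}

func main() {
	fmt.Println(storePath("libfoo-1.2", []byte("...library bytes...")))
	// Same contents always yield the same path.
}
```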
diff --git a/content/publication/zakaria-sc-22/cite.bib b/content/publication/zakaria-sc-22/cite.bib
new file mode 100644
index 00000000000..2b4538ab831
--- /dev/null
+++ b/content/publication/zakaria-sc-22/cite.bib
@@ -0,0 +1,13 @@
+@inproceedings{zakaria:sc22,
+ address = {Dallas, TX},
+ author = {Farid Zakaria and Thomas R. W. Scogland and Todd Gamblin and Carlos Maltzahn},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLXNjMjIucGRmTxEBaAAAAAABaAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EHpha2FyaWEtc2MyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANYLVoAAAIANy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6WC1aOnpha2FyaWEtc2MyMi5wZGYAAA4AIgAQAHoAYQBrAGEAcgBpAGEALQBzAGMAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA1VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvWC1aL3pha2FyaWEtc2MyMi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFQAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABwA==},
+ booktitle = {SC22},
+ date-added = {2022-08-09 12:51:12 -0700},
+ date-modified = {2022-08-09 12:53:35 -0700},
+ keywords = {linking, packaging, softwareengineering, oss, reproducibility, compiler},
+ month = {November 13-18},
+ title = {Mapping Out the HPC Dependency Chaos},
+ year = {2022}
+}
+
diff --git a/content/publication/zakaria-sc-22/index.md b/content/publication/zakaria-sc-22/index.md
new file mode 100644
index 00000000000..1e589437220
--- /dev/null
+++ b/content/publication/zakaria-sc-22/index.md
@@ -0,0 +1,46 @@
+---
+# Documentation: https://wowchemy.com/docs/managing-content/
+
+title: Mapping Out the HPC Dependency Chaos
+subtitle: ''
+summary: ''
+authors:
+- Farid Zakaria
+- Thomas R. W. Scogland
+- Todd Gamblin
+- Carlos Maltzahn
+tags:
+- linking
+- packaging
+- softwareengineering
+- oss
+- reproducibility
+- compiler
+categories: []
+date: '2022-11-01'
+lastmod: 2022-08-09T12:58:40-07:00
+featured: false
+draft: false
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
+image:
+ caption: ''
+ focal_point: ''
+ preview_only: false
+
+# Projects (optional).
+# Associate this post with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
+# Otherwise, set `projects = []`.
+projects:
+- practical-reproducibility
+- packaging
+publishDate: '2022-08-09T19:58:38.331246Z'
+publication_types:
+- '1'
+abstract: "High Performance Computing (HPC) software stacks have become complex, with the dependencies of some applications numbering in the hundreds. Packaging, distributing, and administering software stacks of that scale is a complex undertaking anywhere. HPC systems deal with esoteric compilers, hardware, and a panoply of uncommon combinations. In this paper, we explore the mechanisms available for packaging software to find its own dependencies in the context of a taxonomy of software distribution, and discuss their benefits and pitfalls. We discuss workarounds for some common problems caused by using these composed stacks and introduce Shrinkwrap: A solution to producing binaries that directly load their dependencies from precise locations and in a precise order. Beyond simplifying the use of the binaries, this approach also speeds up loading as much as 7× for a large dynamically-linked MPI application in our evaluation."
+publication: '*SC22*'
+---
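Shrinkwrap, per the abstract, rewrites binaries so dependencies load from precise locations in a precise order. As a small companion sketch (the binary path is just an example; any dynamically linked ELF file works), Go's standard debug/elf package can show what a binary asks the loader to find at startup:

```go
package main

import (
	"debug/elf"
	"fmt"
	"log"
)

func main() {
	f, err := elf.Open("/bin/ls") // any dynamically linked ELF binary
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// DT_NEEDED entries: libraries the loader must locate by search.
	libs, err := f.ImportedLibraries()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("needed:", libs)

	// RUNPATH (if set) constrains where those libraries are searched for.
	if rp, err := f.DynString(elf.DT_RUNPATH); err == nil {
		fmt.Println("runpath:", rp)
	}
}
```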
diff --git a/data/fonts/carlos.toml b/data/fonts/carlos.toml
new file mode 100644
index 00000000000..5fcf14926cc
--- /dev/null
+++ b/data/fonts/carlos.toml
@@ -0,0 +1,12 @@
+# Font style metadata
+name = "Minimal"
+
+# Optional Google font URL
+google_fonts = "family=EB+Garamond:ital,wght@0,400;0,700;1,400&family=Roboto:wght@400;700&family=Roboto+Mono:ital,wght@0,400;0,700;1,400"
+
+# Font families
+heading_font = "Roboto"
+#body_font = "EB Garamond"
+body_font = "Roboto"
+nav_font = "Roboto"
+mono_font = "Roboto Mono"
diff --git a/data/themes/carlos.toml b/data/themes/carlos.toml
new file mode 100644
index 00000000000..9b486109a65
--- /dev/null
+++ b/data/themes/carlos.toml
@@ -0,0 +1,18 @@
+# Theme metadata
+name = "carlos"
+
+# Is theme light or dark?
+light = true
+
+# Primary
+primary = "#1f6daa"
+
+# Menu
+menu_primary = "#fff"
+menu_text = "#34495e"
+menu_text_active = "#1f6daa"
+menu_title = "#2b2b2b"
+
+# Home sections
+home_section_odd = "rgb(255, 255, 255)"
+home_section_even = "rgb(247, 247, 247)"
diff --git a/go.mod b/go.mod
new file mode 100644
index 00000000000..d81a054102e
--- /dev/null
+++ b/go.mod
@@ -0,0 +1,5 @@
+module github.com/carlosmalt/ucsc-homepage
+
+go 1.15
+
+require github.com/wowchemy/wowchemy-hugo-modules/v5 v5.5.0
\ No newline at end of file
diff --git a/go.sum b/go.sum
new file mode 100644
index 00000000000..299031069ad
--- /dev/null
+++ b/go.sum
@@ -0,0 +1,34 @@
+github.com/wowchemy/wowchemy-hugo-modules/netlify-cms-academic v0.0.0-20210120025205-e0ae7c979cda h1:pOETHk90q0NVqcpi1UcTCZsO0UuchtcyJUuluP+Ae7o=
+github.com/wowchemy/wowchemy-hugo-modules/netlify-cms-academic v0.0.0-20210120025205-e0ae7c979cda/go.mod h1:TU3QDPUdBSQnvDP5QVCwjAkBIdVMS/bKFA8jr3eI5AY=
+github.com/wowchemy/wowchemy-hugo-modules/v5 v5.3.0/go.mod h1:akNBhhT0UAOXSREplKkLe2wyHeo51qm6f+vqNkQkcmE=
+github.com/wowchemy/wowchemy-hugo-modules/v5 v5.4.0 h1:jWlAOA8e40owFc+tyHrQdRZGXQpxbWeml0vkhBftoHg=
+github.com/wowchemy/wowchemy-hugo-modules/v5 v5.4.0/go.mod h1:akNBhhT0UAOXSREplKkLe2wyHeo51qm6f+vqNkQkcmE=
+github.com/wowchemy/wowchemy-hugo-modules/v5 v5.5.0 h1:7mJrVZe/UeeNlaiB2A5OmBlrb/ieXVaMGQXM2r5wGUg=
+github.com/wowchemy/wowchemy-hugo-modules/v5 v5.5.0/go.mod h1:akNBhhT0UAOXSREplKkLe2wyHeo51qm6f+vqNkQkcmE=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210120025205-e0ae7c979cda h1:DjweNGYmNRnVhG48yarIZQiYvLKub1s7K3mwSaSZXQs=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210120025205-e0ae7c979cda/go.mod h1:H22qfH9qj3FWwsk7+bAZpmT24yRGNQURah2/IRwjbn8=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210211185922-b811f9a1bb9c h1:rcd9H2OHyw3xiu2H140g22AUvYZPBfvynAyTfubpvg4=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210211185922-b811f9a1bb9c/go.mod h1:H22qfH9qj3FWwsk7+bAZpmT24yRGNQURah2/IRwjbn8=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210324194200-fda9f39d872e h1:pjf3ttOUrGyqXqFE5HD4zROl5nVD7X06ejz5TwLuwtk=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210324194200-fda9f39d872e/go.mod h1:H22qfH9qj3FWwsk7+bAZpmT24yRGNQURah2/IRwjbn8=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210525210730-89d079bcf055 h1:2aDKEgFpfUu0ebxGvCDUw/W4+UZx3wWseFsYpNvaqr4=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy v0.0.0-20210525210730-89d079bcf055/go.mod h1:H22qfH9qj3FWwsk7+bAZpmT24yRGNQURah2/IRwjbn8=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210211185922-b811f9a1bb9c h1:a5+8Lgpcomdgpz9pPDhKQ53RMbNkKmifQ3hNrYWA0dg=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210211185922-b811f9a1bb9c/go.mod h1:AKpYbqUVlj0VYsc7Jsxe1o8Ko2yV31A5ZPdfpACcXJw=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210324194200-fda9f39d872e h1:rhRiyEZRDSdtwpEzMmZdycHudtYrEyoHKfc8mxOpHwE=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210324194200-fda9f39d872e/go.mod h1:AKpYbqUVlj0VYsc7Jsxe1o8Ko2yV31A5ZPdfpACcXJw=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210525210730-89d079bcf055 h1:e/IE1WGzItQDeAX0o9cRdpLf9835ZbX5ttl/7f7nwTs=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms v0.0.0-20210525210730-89d079bcf055/go.mod h1:AKpYbqUVlj0VYsc7Jsxe1o8Ko2yV31A5ZPdfpACcXJw=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy-cms/v5 v5.0.0-20210629192904-559885af86b7/go.mod h1:Sp/AKo+2HAPi/IPHp1MEdKPmee+mzO5+efUBUPLPqPE=
+github.com/wowchemy/wowchemy-hugo-modules/wowchemy/v5 v5.0.0-20210629192904-559885af86b7/go.mod h1:2iL9rdrUYyJXX2BeHKfK+QbqZlubCsaR60nQ87NRQTY=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-core v0.1.0/go.mod h1:kJwI9H8dicHQCnP8G9EvUDI+oNg/yXcGsjGjwjXuM8I=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-core v0.2.0/go.mod h1:kJwI9H8dicHQCnP8G9EvUDI+oNg/yXcGsjGjwjXuM8I=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-netlify v1.0.1-0.20230812165109-98c928a76715 h1:yIEUin2AbhMSRs6dQrnX8bxBwcmrVO42X+3X3Zxqrss=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-netlify v1.0.1-0.20230812165109-98c928a76715/go.mod h1:s40UgLsWfVyCLQ2F4F3dBcNfZOXcPGld7KxsKhZdzvM=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-netlify-cms v1.0.1-0.20230812165109-98c928a76715 h1:/JLEPNH/fXDXlEgd0t1n6keZ7ghWvgOGBpsKkSeevHM=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-netlify-cms v1.0.1-0.20230812165109-98c928a76715/go.mod h1:X1mETJo6Lkv9tEgfU0UYFRiRInf0RbgW+s1RKXB4GMA=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-reveal v0.0.0-20230812165109-98c928a76715 h1:FIMexgyswUDqr/Jn0ty0WJSX/1qL8fCnhKn64S/JOGI=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-plugin-reveal v0.0.0-20230812165109-98c928a76715/go.mod h1:u2hgU45C6Oi3CwMzSNvTwuRTsKs7O46EG3MTjNKu7gE=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy-seo v0.1.0/go.mod h1:R01vz++1i/KR2n00aWGcs6m/L7ky1klbrpqA2KXjMCk=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy/v5 v5.8.1 h1:aqJ7yg+JYu64bh0r/VpwT2KaqhEHxdiX8JbH92shr00=
+github.com/wowchemy/wowchemy-hugo-themes/modules/wowchemy/v5 v5.8.1/go.mod h1:aTjV1UoWPt03vYsb6TImLXMlby6p1lXF9aUy+GnAuqw=
diff --git a/i18n/en.yaml b/i18n/en.yaml
new file mode 100644
index 00000000000..881bd01f15b
--- /dev/null
+++ b/i18n/en.yaml
@@ -0,0 +1,243 @@
+# Navigation
+
+- id: toggle_navigation
+ translation: Toggle navigation
+
+- id: table_of_contents
+ translation: Table of Contents
+
+- id: on_this_page
+ translation: Contents
+
+- id: back_to_top
+ translation: Back to top
+
+# General
+
+- id: related
+ translation: Related
+
+- id: minute_read
+ translation: min read
+
+- id: previous
+ translation: Previous
+
+- id: next
+ translation: Next
+
+- id: figure
+ translation: "Figure %d:"
+
+- id: edit_page
+ translation: Edit this page
+
+# Buttons
+
+- id: btn_preprint
+ translation: Preprint
+
+- id: btn_pdf
+ translation: PDF
+
+- id: btn_cite
+ translation: Cite
+
+- id: btn_slides
+ translation: Slides
+
+- id: btn_video
+ translation: Video
+
+- id: btn_code
+ translation: Code
+
+- id: btn_dataset
+ translation: Dataset
+
+- id: btn_project
+ translation: Project
+
+- id: btn_poster
+ translation: Poster
+
+- id: btn_source
+ translation: Source Document
+
+- id: btn_copy
+ translation: Copy
+
+- id: btn_download
+ translation: Download
+
+# About widget
+
+- id: interests
+ translation: Interests
+
+- id: education
+ translation: Education
+
+- id: administration
+ translation: Administrative Staff
+
+- id: researchers
+ translation: Research Staff
+
+- id: currentphds
+ translation: Current Ph.D. Students
+
+- id: currentmss
+ translation: Current M.S. Students
+
+- id: graduatedphds
+ translation: Graduated Ph.D. Students
+
+- id: graduatedmss
+ translation: Graduated M.S. Students
+
+- id: user_profile_latest
+ translation: Latest
+
+# Accomplishments widget
+
+- id: see_certificate
+ translation: See certificate
+
+# Experience widget
+
+- id: present
+ translation: Present
+
+# Pages widget
+
+- id: more_pages
+ translation: See all
+
+- id: more_posts
+ translation: See all posts
+
+- id: more_talks
+ translation: See all talks
+
+- id: more_publications
+ translation: See all publications
+
+# Contact widget
+
+- id: contact_name
+ translation: Name
+
+- id: contact_email
+ translation: Email
+
+- id: contact_message
+ translation: Message
+
+- id: contact_send
+ translation: Send
+
+- id: book_appointment
+ translation: Book an appointment
+
+# Publication/Talk details
+
+- id: abstract
+ translation: Abstract
+
+- id: publication
+ translation: Publication
+
+- id: publication_type
+ translation: Type
+
+- id: date
+ translation: Date
+
+- id: last_updated
+ translation: Last updated on
+
+- id: event
+ translation: Event
+
+- id: location
+ translation: Location
+
+- id: pub_uncat
+ translation: Uncategorized
+
+- id: pub_conf
+ translation: Conference paper
+
+- id: pub_journal
+ translation: Journal article
+
+- id: pub_preprint
+ translation: Preprint
+
+- id: pub_report
+ translation: Report
+
+- id: pub_book
+ translation: Book
+
+- id: pub_book_section
+ translation: Book section
+
+- id: pub_thesis
+ translation: Thesis
+
+- id: pub_patent
+ translation: Patent
+
+# Project details
+
+- id: open_project_site
+ translation: Go to Project Site
+
+# Default titles for archive pages
+
+- id: posts
+ translation: Posts
+
+- id: publications
+ translation: Publications
+
+- id: talks
+ translation: Talks
+
+- id: projects
+ translation: Projects
+
+# Search
+
+- id: search
+ translation: Search
+
+- id: search_placeholder
+ translation: Search...
+
+- id: search_results
+ translation: results found
+
+- id: search_no_results
+ translation: No results found
+
+# Error 404
+
+- id: page_not_found
+ translation: Page not found
+
+- id: 404_recommendations
+ translation: Perhaps you were looking for one of these?
+
+# Cookie consent
+
+- id: cookie_message
+ translation: This website uses cookies to ensure you get the best experience on our website.
+
+- id: cookie_dismiss
+ translation: Got it!
+
+- id: cookie_learn
+ translation: Learn more
diff --git a/layouts/authors/list.html b/layouts/authors/list.html
new file mode 100644
index 00000000000..05e5dfda5c2
--- /dev/null
+++ b/layouts/authors/list.html
@@ -0,0 +1,39 @@
+{{/* Author profile page. */}}
+
+{{- define "main" -}}
+
+{{/* If an account has not been created for this user, just display their name as the title. */}}
+{{ if not .File }}
+
+
+{{ .Title }}
+
+{{ end }}
+
+
+
+ {{/* Show the About widget if an account exists for this user. */}}
+ {{ if .File }}
+ {{ $widget := "widgets/aboutcm.html" }}
+ {{ $username := (path.Base (path.Split .Path).Dir) }}{{/* Alternatively, use `index .Params.authors 0` */}}
+ {{ $params := dict "root" $ "page" . "author" $username }}
+ {{ partial $widget $params }}
+ {{end}}
+
+ {{ $query := where .Pages ".IsNode" false }}
+ {{ $count := len $query }}
+ {{ if $count }}
+
+
+ {{/* Only display widget title in explicit instances of about widget, not in author pages. */}}
+ {{ if and $page.Params.widget $page.Title }}
+ {{ $page.Title | markdownify | emojify }}
+ {{ end }}
+
+ {{ $person_page.Content }}
+
+
+
+ {{ with $person.interests }}
+
+
{{ i18n "interests" | markdownify }}
+
+ {{ range . }}
+
{{ . | markdownify | emojify }}
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.administration }}
+
+
{{ i18n "administration" | markdownify }}
+
+ {{ range . }}
+
{{ . | markdownify | emojify }}
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.researchers }}
+
+
{{ i18n "researchers" | markdownify }}
+
+ {{ range . }}
+
{{ . | markdownify | emojify }}
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.currentphds }}
+
+
{{ i18n "currentphds" | markdownify }}
+
+ {{ range . }}
+
{{ . | markdownify | emojify }}
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.currentmss }}
+
+
{{ i18n "currentmss" | markdownify }}
+
+ {{ range . }}
+
{{ . | markdownify | emojify }}
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.education }}
+
+
{{ i18n "education" | markdownify }}
+
+ {{ range .courses }}
+
+
+
+
{{ .course }}{{ with .year }}, {{ . }}{{ end }}
+
{{ .institution }}
+
+
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.graduatedphds }}
+
+
{{ i18n "graduatedphds" | markdownify }}
+
+ {{ range .courses }}
+
+
+
+
{{ .course | markdownify | emojify }}{{ with .year }}, {{ . | markdownify | emojify }}{{ end }}
+
{{ .institution }}
+
+
+ {{ end }}
+
+
+ {{ end }}
+
+ {{ with $person.graduatedmss }}
+
+
{{ i18n "graduatedmss" | markdownify }}
+
+ {{ range .courses }}
+
+
+
+
{{ .course | markdownify | emojify }}{{ with .year }}, {{ . | markdownify | emojify }}{{ end }}
+
{{ .institution }}
+
+
+ {{ end }}
+
+
+ {{ end }}
+
+
+
+
+
diff --git a/maltzahn.bib b/maltzahn.bib
new file mode 100644
index 00000000000..5af7d54717c
--- /dev/null
+++ b/maltzahn.bib
@@ -0,0 +1,2321 @@
+%% This BibTeX bibliography file was created using BibDesk.
+%% https://bibdesk.sourceforge.io/
+
+%% Created for Carlos Maltzahn at 2023-09-03 17:15:09 -0700
+
+
+%% Saved with string encoding Unicode (UTF-8)
+
+
+
+@inproceedings{liu:hpec23,
+ abstract = {High-performance computing (HPC) systems researchers have proposed using current, programmable network interface cards (or SmartNICs) to offload data management services that would otherwise consume host processor cycles in a platform. While this work has successfully mapped data pipelines to a collection of SmartNICs, users require a flexible means of inspecting in-transit data to assess the live state of the system. In this paper, we explore SmartNIC-driven opportunistic query execution, i.e., enabling the SmartNIC to make a decision about whether to execute a query operation locally (i.e., ``offload'') or defer execution to the client (i.e., ``push-back''). Characterizations of different parts of the end-to-end query path allow the decision engine to make complexity predictions that would not be feasible by the client alone.},
+ address = {Virtual},
+ author = {Jianshen Liu and Carlos Maltzahn and Craig Ulmer},
+ booktitle = {HPEC '23},
+ date-added = {2023-08-29 19:45:03 -0700},
+ date-modified = {2023-08-29 19:56:34 -0700},
+ keywords = {papers, smartnics, querying, queryprocessing, streaming, streamprocessing, analysis},
+ month = {September 25-29},
+ title = {{Opportunistic Query Execution on SmartNICs for Analyzing In-Transit Data}},
+ year = {2023}}
+
+@inproceedings{ulmer:compsys23,
+ address = {St. Petersburg, FL, USA},
+ author = {Craig Ulmer and Jianshen Liu and Carlos Maltzahn and Matthew L. Curry},
+ booktitle = {2nd Workshop on Composable Systems (COMPSYS 2023, co-located with IPDPS 2023)},
+ date-added = {2023-03-09 10:29:28 -0800},
+ date-modified = {2023-03-09 10:30:50 -0800},
+ keywords = {smartnics, composability, datamanagement},
+ month = {May 15-19},
+ title = {{Extending Composable Data Services into SmartNICs (Best Paper Award)}},
+ year = {2023},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1UtVi91bG1lci1jb21wc3lzMjMucGRmTxEBcgAAAAABcgACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////E3VsbWVyLWNvbXBzeXMyMy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////gN9l2AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANVLVYAAAIAOi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6VS1WOnVsbWVyLWNvbXBzeXMyMy5wZGYADgAoABMAdQBsAG0AZQByAC0AYwBvAG0AcABzAHkAcwAyADMALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADhVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9VLVYvdWxtZXItY29tcHN5czIzLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABXAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAc0=}}
+
+@unpublished{amvrosiadis:nsfvision18,
+ author = {George Amvrosiadis and Ali R. Butt and Vasily Tarasov and Ming Zhao and others},
+ date-added = {2023-01-13 13:20:46 -0800},
+ date-modified = {2023-01-13 13:20:46 -0800},
+ keywords = {papers, vision, storage, systems, research},
+ month = {May 30 - June 1},
+ note = {Report on NSF Visioning Workshop},
+ title = {Data Storage Research Vision 2025},
+ year = {2018},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA2Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmTxEBjAAAAAABjAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////G2FtdnJvc2lhZGlzLW5zZnZpc2lvbjE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////X+gitAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFBAAACAEAvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkE6YW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmAA4AOAAbAGEAbQB2AHIAbwBzAGkAYQBkAGkAcwAtAG4AcwBmAHYAaQBzAGkAbwBuADEAOAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAPlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW12cm9zaWFkaXMtbnNmdmlzaW9uMTgucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAF0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB7Q==},
+ bdsk-url-1 = {https://www.overleaf.com/7988123186fbmpsqghjkgr}}
+
+@inproceedings{jimenez:agu18,
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {AGU Fall Meeting},
+ date-added = {2023-01-11 22:59:55 -0800},
+ date-modified = {2023-01-11 23:06:28 -0800},
+ keywords = {reproducibility},
+ month = {December 12-14},
+ title = {Reproducible, Automated and Portable Computational and Data Science Experimentation Pipelines with Popper},
+ year = {2018},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0ktSi9qaW1lbmV6LWFndTE4LnBkZk8RAWoAAAAAAWoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xFqaW1lbmV6LWFndTE4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TxDwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADSS1KAAACADgvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkktSjpqaW1lbmV6LWFndTE4LnBkZgAOACQAEQBqAGkAbQBlAG4AZQB6AC0AYQBnAHUAMQA4AC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA2VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvSS1KL2ppbWVuZXotYWd1MTgucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFUAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABww==}}
+
+@inproceedings{lefevre:snia20,
+ address = {Virtual},
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ booktitle = {SNIA SDC 2020},
+ date-added = {2023-01-11 22:37:16 -0800},
+ date-modified = {2023-01-11 22:40:46 -0800},
+ keywords = {programmable, storage},
+ month = {September 23},
+ title = {SkyhookDM: Storage and Management of Tabular Data in Ceph},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1zbmlhMjAucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////EmxlZmV2cmUtc25pYTIwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////f5OqIAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFMAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkw6bGVmZXZyZS1zbmlhMjAucGRmAAAOACYAEgBsAGUAZgBlAHYAcgBlAC0AcwBuAGkAYQAyADAALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9ML2xlZmV2cmUtc25pYTIwLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1zbmlhMjAtc2xpZGVzLnBkZk8RAYQAAAAAAYQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xlsZWZldnJlLXNuaWEyMC1zbGlkZXMucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TrUwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTAAAAgA+LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpMOmxlZmV2cmUtc25pYTIwLXNsaWRlcy5wZGYADgA0ABkAbABlAGYAZQB2AHIAZQAtAHMAbgBpAGEAMgAwAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADxVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9ML2xlZmV2cmUtc25pYTIwLXNsaWRlcy5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHj}}
+
+@inproceedings{chakraborty:sdc21,
+ address = {Virtual},
+ author = {Jayjeet Chakraborty and Carlos Maltzahn},
+ booktitle = {SNIA SDC 2021},
+ date-added = {2023-01-11 22:30:29 -0800},
+ date-modified = {2023-01-11 22:32:09 -0800},
+ keywords = {programmable, storage},
+ month = {September 28-29},
+ title = {{SkyhookDM: An Arrow-Native Storage System}},
+ year = {2021},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktc25pYTIxLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xZjaGFrcmFib3J0eS1zbmlhMjEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3+TplAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABQwAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpDOmNoYWtyYWJvcnR5LXNuaWEyMS5wZGYAAA4ALgAWAGMAaABhAGsAcgBhAGIAbwByAHQAeQAtAHMAbgBpAGEAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1zbmlhMjEucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA4Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktc25pYTIxLXNsaWRlcy5wZGZPEQGUAAAAAAGUAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8dY2hha3JhYm9ydHktc25pYTIxLXNsaWRlcy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9/k6/EAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAQi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1zbmlhMjEtc2xpZGVzLnBkZgAOADwAHQBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBzAG4AaQBhADIAMQAtAHMAbABpAGQAZQBzAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgBAVXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1zbmlhMjEtc2xpZGVzLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABfAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAfc=}}
+
+@inproceedings{malik:precs22,
+ author = {Tanu Malik and Anjo Vahldiek-Oberwagner and Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {P-RECS'22},
+ date-added = {2023-01-11 21:05:52 -0800},
+ date-modified = {2023-01-11 21:07:18 -0800},
+ doi = {10.1145/3526062.3536354},
+ keywords = {reproducibility},
+ title = {{Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21}},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsaWstcHJlY3MyMi5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RbWFsaWstcHJlY3MyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9/k1PsAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAU0AAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TTptYWxpay1wcmVjczIyLnBkZgAOACQAEQBtAGEAbABpAGsALQBwAHIAZQBjAHMAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTS9tYWxpay1wcmVjczIyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=}}
+
+@unpublished{zakaria:nixcon22,
+ abstract = {Nix has introduced the world to store-based systems and ushered in a new wave of reproducibility. These new systems, however, are built atop long-established patterns and occasionally leverage them to band-aid over the problems Nix aims to solve.
+
+How much further can we leverage the store abstraction to rethink long-valued, established patterns in Unix-based operating systems? This talk will introduce some of the simple improvements one can uncover starting at the linking phase of object building and process startup.
+
+The authors introduce Shrinkwrap, which can greatly improve startup performance and further improve reproducibility for applications ported to Nix by making simple improvements to how libraries are discovered and leveraging the store further. Additional explorations for improvements during the linking phase will be discussed and explored. It's time we rethink everything.
+},
+ author = {Farid Zakaria and Tom Scogland and Carlos Maltzahn},
+ date-added = {2022-11-07 19:32:09 -0800},
+ date-modified = {2022-11-07 19:32:09 -0800},
+ keywords = {linking, reproducibility, packaging},
+ month = {October 20-22},
+ note = {NixCon 2022, Paris, France},
+ title = {Rethinking basic primitives for store based systems},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLW5peGNvbjIyLnBkZk8RAXgAAAAAAXgAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xR6YWthcmlhLW5peGNvbjIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////348MogAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADWC1aAAACADsvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlgtWjp6YWthcmlhLW5peGNvbjIyLnBkZgAADgAqABQAegBhAGsAYQByAGkAYQAtAG4AaQB4AGMAbwBuADIAMgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLW5peGNvbjIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAWAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHU},
+ bdsk-url-1 = {https://drive.google.com/file/d/1uFE5UfvteXxkM4KCOjbSh52yGPa2hZtg/view}}
+
+@unpublished{nsf:repeto22,
+ abstract = {The Repeto project will foster community practices to make reproducibility a part of mainstream research and education activities in computer science. The project seeks to understand the cost/benefit equation of reproducibility for the computer science systems community, the factors that make reproducibility feasible or infeasible, as well as isolate factors (be they technical or usage oriented) that make practical reproducibility of experiments difficult. This research coordination network will develop a range of activities from teaching methodology for packaging experiments for cost-effective replication; using reproducibility in teaching; collaboration with reproducibility initiatives sponsored through conferences and institutions; community events emphasizing repeating or replicating experiments such as hackathons, competitions, or rankings; fostering repositories of replicable experiments and monitoring their usage/replication; to reporting on state of art and emergent requirements for the support of practical reproducibility. The outcomes of the proposal will be a collection of computer science experiments replicable on open platforms, an understanding of how much and to what extent they are used in mainstream research and education activities via relevant metrics, as well as a series of reports on current enablers and obstacles towards mainstream use of reproducibility in computer science research.
+
+Replicable experiments will be created using platform programmability tools including the Chameleon environment and associated software such as CHI, Trovi, and Jupyter notebooks. This platform programmability approach allows experimenters to express complex experimental topologies in repeatable and persistent ways. Combining platform programmability with executable notebooks will allow investigators to capture the full experimental process for subsequent replication by other researchers.
+
+This award by the CISE Office of Advanced Cyberinfrastructure is jointly supported by the CISE Computer and Networked Systems Division.
+
+This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.},
+ author = {{National Science Foundation -- Office of Advanced Cyberinfrastructure (OAC)}},
+ date-added = {2022-08-16 17:33:00 -0700},
+ date-modified = {2022-08-16 18:27:26 -0700},
+ keywords = {funding},
+ month = {October},
+ note = {Available at www.nsf.gov/awardsearch/showAward?AWD\_ID=2226407},
+ title = {Collaborative Research: Disciplinary Improvements: Repeto: Building a Network for Practical Reproducibility in Experimental Computer Science},
+ year = {2022}}
+
+@inproceedings{liu:hpec22,
+ abstract = {Many distributed applications implement complex data flows and need a flexible mechanism for routing data between producers and consumers. Recent advances in programmable network interface cards, or SmartNICs, represent an opportunity to offload data-flow tasks into the network fabric, thereby freeing the hosts to perform other work. System architects in this space face multiple questions about the best way to leverage SmartNICs as processing elements in data flows. In this paper, we advocate the use of Apache Arrow as a foundation to implement data flow tasks on SmartNICs. We report on our experience adapting a partitioning algorithm for particle data to Apache Arrow and measure the on-card processing performance for the BlueField-2 SmartNIC. Our experiments confirm that the BlueField-2's (de)compression hardware can have a significant impact on in-transit workflows where data must be unpacked, processed, and repacked.},
+ address = {Virtual Event},
+ author = {Jianshen Liu and Carlos Maltzahn and Matthew L. Curry and Craig Ulmer},
+ booktitle = {2022 IEEE High Performance Extreme Computing Conference (IEEE HPEC 2022)},
+ date-added = {2022-08-16 17:08:46 -0700},
+ date-modified = {2022-08-16 18:44:04 -0700},
+ keywords = {smartnics, offloading, datamanagement, hpc},
+ month = {September 19-23},
+ title = {{Processing Particle Data Flows with SmartNICs}},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWhwZWMyMi5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8ObGl1LWhwZWMyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////981GukAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaXUtaHBlYzIyLnBkZgAADgAeAA4AbABpAHUALQBoAHAAZQBjADIAMgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWhwZWMyMi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==}}
+
+@inproceedings{zakaria:sc22,
+ abstract = {High Performance Computing (HPC) software stacks have become complex, with the dependencies of some applications numbering in the hundreds. Packaging, distributing, and administering software stacks of that scale is a complex undertaking anywhere. HPC systems deal with esoteric compilers, hardware, and a panoply of uncommon combinations. In this paper, we explore the mechanisms available for packaging software to find its own dependencies in the context of a taxonomy of software distribution, and discuss their benefits and pitfalls. We discuss workarounds for some common problems caused by using these composed stacks and introduce Shrinkwrap: A solution to producing binaries that directly load their dependencies from precise locations and in a precise order. Beyond simplifying the use of the binaries, this approach also speeds up loading as much as 7× for a large dynamically-linked MPI application in our evaluation.},
+ address = {Dallas, TX},
+ author = {Farid Zakaria and Thomas R. W. Scogland and Todd Gamblin and Carlos Maltzahn},
+ booktitle = {SC22},
+ date-added = {2022-08-09 12:51:12 -0700},
+ date-modified = {2022-08-16 18:42:03 -0700},
+ keywords = {linking, packaging, softwareengineering, oss, reproducibility, compiler},
+ month = {November 13-18},
+ title = {Mapping Out the HPC Dependency Chaos},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1gtWi96YWthcmlhLXNjMjIucGRmTxEBaAAAAAABaAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////EHpha2FyaWEtc2MyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////fGAhHAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANYLVoAAAIANy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6WC1aOnpha2FyaWEtc2MyMi5wZGYAAA4AIgAQAHoAYQBrAGEAcgBpAGEALQBzAGMAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA1VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvWC1aL3pha2FyaWEtc2MyMi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFQAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABwA==}}
+
+@unpublished{sloan:ucospo22,
+ author = {{Alfred P. Sloan Foundation -- Better Software for Science Program}},
+ date-added = {2022-08-04 06:46:49 -0700},
+ date-modified = {2022-08-04 06:50:01 -0700},
+ keywords = {funding},
+ month = {January},
+ note = {Available at sloan.org/grant-detail/9723},
+ title = {{To pilot a postdoctoral fellowship on open source software development and support other activities at the University of California Santa Cruz Open Source Program Office}},
+ year = {2022}}
+
+@article{lieggi:rhrq22,
+ author = {Stephanie Lieggi},
+ date-added = {2022-05-10 16:11:16 -0700},
+ date-modified = {2022-05-10 16:11:48 -0700},
+ journal = {Red Hat Research Quarterly},
+ keywords = {oss, ospo, academia},
+ month = {February},
+ number = {4},
+ pages = {5--6},
+ title = {Building a university {OSPO}: Bolstering academic research through open source},
+ volume = {3},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJocnEyMi5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RbGllZ2dpLXJocnEyMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////96gPtQAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaWVnZ2ktcmhycTIyLnBkZgAOACQAEQBsAGkAZQBnAGcAaQAtAHIAaAByAHEAMgAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTC9saWVnZ2ktcmhycTIyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=}}
+
+@unpublished{chakraborty:arrowblog22,
+ author = {Jayjeet Chakraborty and Carlos Maltzahn and David Li and Tom Drabas},
+ date-added = {2022-05-06 12:28:50 -0700},
+ date-modified = {2022-05-06 12:28:50 -0700},
+ keywords = {computation, storage, programmable, datamanagement, ceph, arrow},
+ month = {January 31},
+ note = {Available at arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/},
+ title = {Skyhook: Bringing Computation to Storage with Apache Arrow},
+ year = {2022},
+ bdsk-url-1 = {https://arrow.apache.org/blog/2022/01/31/skyhook-bringing-computation-to-storage-with-apache-arrow/}}
+
+@inproceedings{chakraborty:ccgrid22,
+ abstract = {With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently, save the network, and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the system internals. Previous approaches re-implemented functionality of data processing frameworks and access libraries for a particular storage system, a duplication of effort that might have to be repeated for different storage systems.
+
+This paper introduces a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with no modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging distributed storage systems' scale-out and availability properties. We present Skyhook, an example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of Skyhook and discuss key results.},
+ address = {Taormina (Messina), Italy},
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Sebastiaan Alvarez Rodriguez and Alexandru Uta and Jeff LeFevre and Carlos Maltzahn},
+ booktitle = {CCGrid22},
+ date-added = {2022-04-11 19:45:31 -0700},
+ date-modified = {2022-04-11 19:57:58 -0700},
+ keywords = {papers, programmable, storage, systems, arrow, nsf1836650, nsf1705021, nsf1764102},
+ month = {May 16-19},
+ title = {Skyhook: Towards an Arrow-Native Storage System},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAzLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2NncmlkMjIucGRmTxEBggAAAAABggACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////GGNoYWtyYWJvcnR5LWNjZ3JpZDIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////eejbkAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFDAAACAD0vOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkM6Y2hha3JhYm9ydHktY2NncmlkMjIucGRmAAAOADIAGABjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBjAGMAZwByAGkAZAAyADIALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADtVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9DL2NoYWtyYWJvcnR5LWNjZ3JpZDIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAWgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHg}}
+
+@article{harrell:tpds22,
+ abstract = {In this special section we bring you a practice and experience effort in reproducibility for large-scale computational science at SC20. This section includes nine critiques, each by a student team that reproduced results from a paper published at SC19, during the following year's Student Cluster Competition. The paper is also included in this section and has been expanded upon, now including an analysis of the outcomes of the students' reproducibility experiments. Lastly, this special section encapsulates a variety of advances in reproducibility in the SC conference series technical program.},
+ author = {Stephen Lien Harrell and Scott Michael and Carlos Maltzahn},
+ date-added = {2022-04-11 19:38:53 -0700},
+ date-modified = {2022-04-11 19:42:38 -0700},
+ journal = {IEEE Transactions on Parallel and Distributed Systems},
+ keywords = {reproducibility, conference, hpc},
+ month = {September},
+ number = {9},
+ pages = {2011--2013},
+ title = {Advancing Adoption of Reproducibility in HPC: A Preface to the Special Section},
+ volume = {33},
+ year = {2022},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0gvaGFycmVsbC10cGRzMjIucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////EmhhcnJlbGwtdHBkczIyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////eejVfAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFIAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkg6aGFycmVsbC10cGRzMjIucGRmAAAOACYAEgBoAGEAcgByAGUAbABsAC0AdABwAGQAcwAyADIALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9IL2hhcnJlbGwtdHBkczIyLnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC}}
+
+@inproceedings{rodriguez:bigdata21,
+ abstract = {Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability is urgent. Recently, Apache Arrow was chosen by the community to serve as a format mediator, providing efficient in-memory data representation. Arrow enables efficient data movement between data processing and storage engines, significantly improving interoperability and overall performance. In this work, we design a new zero-cost data interoperability layer between Apache Spark and Arrow-based data sources through the Arrow Dataset API. Our novel data interface helps separate the computation (Spark) and data (Arrow) layers. This enables practitioners to seamlessly use Spark to access data from all Arrow Dataset API-enabled data sources and frameworks. To benefit our community, we open-source our work and show that consuming data through Apache Arrow is zero-cost: our novel data interface is either on-par or more performant than native Spark.},
+ address = {Virtual Event},
+ author = {Sebastiaan Alvarez Rodriguez and Jayjeet Chakraborty and Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn and Alexandru Uta},
+ booktitle = {2021 IEEE International Conference on Big Data (IEEE BigData 2021)},
+ date-added = {2022-04-11 19:33:51 -0700},
+ date-modified = {2022-04-11 19:59:07 -0700},
+ keywords = {papers, spark, arrow, performance, nsf1836650},
+ month = {December 15-18},
+ title = {Zero-Cost, Arrow-Enabled Data Interface for Apache Spark},
+ year = {2021},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYmlnZGF0YTIxLnBkZk8RAYIAAAAAAYIAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xdyb2RyaWd1ZXotYmlnZGF0YTIxLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////3nozqQAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADUS1SAAACAD4vOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlEtUjpyb2RyaWd1ZXotYmlnZGF0YTIxLnBkZgAOADAAFwByAG8AZAByAGkAZwB1AGUAegAtAGIAaQBnAGQAYQB0AGEAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA8VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUS1SL3JvZHJpZ3Vlei1iaWdkYXRhMjEucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFsAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB4Q==}}
+
+@unpublished{rodriguez:arxiv21,
+ abstract = {Distributed data processing ecosystems are widespread and their components are highly specialized, such that efficient interoperability is urgent. Recently, Apache Arrow was chosen by the community to serve as a format mediator, providing efficient in-memory data representation. Arrow enables efficient data movement between data processing and storage engines, significantly improving interoperability and overall performance. In this work, we design a new zero-cost data interoperability layer between Apache Spark and Arrow-based data sources through the Arrow Dataset API. Our novel data interface helps separate the computation (Spark) and data (Arrow) layers. This enables practitioners to seamlessly use Spark to access data from all Arrow Dataset API-enabled data sources and frameworks. To benefit our community, we open-source our work and show that consuming data through Apache Arrow is zero-cost: our novel data interface is either on-par or more performant than native Spark.},
+ author = {Sebastiaan Alvarez Rodriguez and Jayjeet Chakraborty and Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn and Alexandru Uta},
+ date-added = {2021-07-23 11:42:12 -0700},
+ date-modified = {2021-07-23 11:55:28 -0700},
+ keywords = {papers, spark, arrow, performance},
+ month = {June 24},
+ note = {arxiv.org/abs/2106.13020 [cs.DC]},
+ title = {Zero-Cost, Arrow-Enabled Data Interface for Apache Spark},
+ year = {2021},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYXJ4aXYyMS5wZGZPEQF6AAAAAAF6AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8Vcm9kcmlndWV6LWFyeGl2MjEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////90gXGEAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAA1EtUgAAAgA8LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpRLVI6cm9kcmlndWV6LWFyeGl2MjEucGRmAA4ALAAVAHIAbwBkAHIAaQBnAHUAZQB6AC0AYQByAHgAaQB2ADIAMQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1EtUi9yb2RyaWd1ZXotYXJ4aXYyMS5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHX}}
+
+@unpublished{liu:arxiv21,
+ abstract = {High-performance computing (HPC) researchers have long envisioned scenarios where application workflows could be improved through the use of programmable processing elements embedded in the network fabric. Recently, vendors have introduced programmable Smart Network Interface Cards (SmartNICs) that enable computations to be offloaded to the edge of the network. There is great interest in both the HPC and high-performance data analytics communities in understanding the roles these devices may play in the data paths of upcoming systems.
+
+This paper focuses on characterizing both the networking and computing aspects of NVIDIA's new BlueField-2 SmartNIC when used in an Ethernet environment. For the networking evaluation we conducted multiple transfer experiments between processors located at the host, the SmartNIC, and a remote host. These tests illuminate how much processing headroom is available on the SmartNIC during transfers. For the computing evaluation we used the stress-ng benchmark to compare the BlueField-2 to other servers and place realistic bounds on the types of offload operations that are appropriate for the hardware.
+
+Our findings from this work indicate that while the BlueField-2 provides a flexible means of processing data at the network's edge, great care must be taken to not overwhelm the hardware. While the host can easily saturate the network link, the SmartNIC's embedded processors may not have enough computing resources to sustain more than half the expected bandwidth when using kernel-space packet processing. From a computational perspective, encryption operations, memory operations under contention, and on-card IPC operations on the SmartNIC perform significantly better than the general-purpose servers used for comparisons in our experiments. Therefore, applications that mainly focus on these operations may be good candidates for offloading to the SmartNIC. },
+ author = {Jianshen Liu and Carlos Maltzahn and Craig Ulmer and Matthew Leon Curry},
+ date-added = {2021-07-23 11:37:49 -0700},
+ date-modified = {2021-07-23 12:02:34 -0700},
+ keywords = {papers, smartnics, performance},
+ month = {May 14},
+ note = {arxiv.org/abs/2105.06619 [cs.NI]},
+ title = {Performance Characteristics of the BlueField-2 SmartNIC},
+ year = {2021},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAqLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWFyeGl2MjEucGRmTxEBXAAAAAABXAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////D2xpdS1hcnhpdjIxLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////dIFtZAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFMAAACADQvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkw6bGl1LWFyeGl2MjEucGRmAA4AIAAPAGwAaQB1AC0AYQByAHgAaQB2ADIAMQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWFyeGl2MjEucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFEAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==},
+ bdsk-url-1 = {https://www.nextplatform.com/2021/05/24/testing-the-limits-of-the-bluefield-2-smartnic/}}
+
+@unpublished{chakraborty:arxiv21,
+ abstract = {With the ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec the CPU has become the bottleneck, trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires extensive understanding of the internals. Previous approaches re-implemented functionality of data processing frameworks and access library for a particular storage system, a duplication of effort that might have to be repeated for different storage systems. In this paper, we introduce a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with minimal modifications. In this approach data processing frameworks and access libraries can evolve independently from storage systems while leveraging the scale-out and availability properties of distributed storage systems. We present one example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of our implementation and discuss key results. },
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Sebastiaan Alvarez Rodriguez and Alexandru Uta and Jeff LeFevre and Carlos Maltzahn},
+ date-added = {2021-07-23 10:50:21 -0700},
+ date-modified = {2021-07-23 13:47:37 -0700},
+ keywords = {papers, programmable, storage, systems, arrow},
+ month = {May 21},
+ note = {arxiv.org/abs/2105.09894 [cs.DC]},
+ title = {Towards an Arrow-native Storage System},
+ year = {2021},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktYXJ4aXYyMS5wZGZPEQF8AAAAAAF8AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8XY2hha3JhYm9ydHktYXJ4aXYyMS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9zcGnQAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAPC86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1hcnhpdjIxLnBkZgAOADAAFwBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBhAHIAeABpAHYAMgAxAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA6VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1hcnhpdjIxLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABZAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=}}
+
+@article{chu:epjconf20,
+ abstract = {Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns are causing less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) fully leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.},
+ author = {Aaron Chu and Jeff LeFevre and Carlos Maltzahn and Aldrin Montana and Peter Alvaro and Dana Robinson and Quincey Koziol},
+ date-added = {2020-12-10 16:45:30 -0800},
+ date-modified = {2022-07-02 17:49:58 -0700},
+ journal = {EPJ Web Conf.},
+ keywords = {papers, programmable, declarative, objectstorage, nsf1836650},
+ month = {November 16},
+	pages = {04037},
+ title = {Mapping Datasets to Programmable Storage},
+	volume = {245},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWVwamNvbmYyMC5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RY2h1LWVwamNvbmYyMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9v4AnQAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaHUtZXBqY29uZjIwLnBkZgAOACQAEQBjAGgAdQAtAGUAcABqAGMAbwBuAGYAMgAwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtZXBqY29uZjIwLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS1zbGlkZXMucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FWNodS1jaGVwMTktc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////bR5eSAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFDAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkM6Y2h1LWNoZXAxOS1zbGlkZXMucGRmAA4ALAAVAGMAaAB1AC0AYwBoAGUAcAAxADkALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS1zbGlkZXMucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==},
+ bdsk-url-1 = {https://indico.cern.ch/event/773049/contributions/3474413/}}
+
+@inproceedings{lieggi:rse-hpc20,
+ author = {Stephanie Lieggi and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn},
+	booktitle = {RSE-HPC: Research Software Engineers in HPC -- Creating Community, Building Careers, Addressing Challenges, co-located with SC20},
+ date-added = {2020-11-30 12:29:24 -0800},
+ date-modified = {2020-11-30 12:31:45 -0800},
+ keywords = {papers, softwareengineering, oss, cross},
+ month = {November 12},
+ title = {The CROSS Incubator: A Case Study for funding and training RSEs},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAvLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJzZS1ocGMyMC5wZGZPEQFyAAAAAAFyAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8UbGllZ2dpLXJzZS1ocGMyMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////97rH54AAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIAOS86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaWVnZ2ktcnNlLWhwYzIwLnBkZgAADgAqABQAbABpAGUAZwBnAGkALQByAHMAZQAtAGgAcABjADIAMAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAN1VzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJzZS1ocGMyMC5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFYAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzA==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA2Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJzZS1ocGMyMC1zbGlkZXMucGRmTxEBjAAAAAABjAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////G2xpZWdnaS1yc2UtaHBjMjAtc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////b9ZnGAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFMAAACAEAvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkw6bGllZ2dpLXJzZS1ocGMyMC1zbGlkZXMucGRmAA4AOAAbAGwAaQBlAGcAZwBpAC0AcgBzAGUALQBoAHAAYwAyADAALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAPlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0wvbGllZ2dpLXJzZS1ocGMyMC1zbGlkZXMucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAF0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB7Q==}}
+
+@inproceedings{chakraborty:canopie20,
+ author = {Jayjeet Chakraborty and Carlos Maltzahn and Ivo Jimenez},
+ booktitle = {CANOPIE HPC 2020 (at SC20)},
+ date-added = {2020-11-30 07:28:21 -0800},
+ date-modified = {2022-04-11 19:55:33 -0700},
+	keywords = {papers, reproducibility, containers, workflow, orchestration, nsf1836650},
+ month = {November 12},
+ title = {Enabling seamless execution of computational and data science workflows on HPC and cloud with the Popper container-native automation engine},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2Fub3BpZTIwLnBkZk8RAYQAAAAAAYQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xljaGFrcmFib3J0eS1jYW5vcGllMjAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2+pOygAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABQwAAAgA+LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpDOmNoYWtyYWJvcnR5LWNhbm9waWUyMC5wZGYADgA0ABkAYwBoAGEAawByAGEAYgBvAHIAdAB5AC0AYwBhAG4AbwBwAGkAZQAyADAALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADxVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9DL2NoYWtyYWJvcnR5LWNhbm9waWUyMC5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHj},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA8Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2Fub3BpZS0yMC1zbGlkZXMucGRmTxEBpAAAAAABpAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////H2NoYWtyYWJvcnR5LWNhbm9waSNGRkZGRkZGRi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////b6k8SAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFDAAACAEYvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkM6Y2hha3JhYm9ydHktY2Fub3BpZS0yMC1zbGlkZXMucGRmAA4ARAAhAGMAaABhAGsAcgBhAGIAbwByAHQAeQAtAGMAYQBuAG8AcABpAGUALQAyADAALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIARFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktY2Fub3BpZS0yMC1zbGlkZXMucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAGMAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAACCw==}}
+
+@article{lefevre:login20,
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ date-added = {2020-06-12 18:36:51 -0700},
+ date-modified = {2020-07-01 12:34:36 -0700},
+ journal = {USENIX ;login:},
+ keywords = {papers, programmable, storage, ceph, physicaldesign, cross, nsf1836650, nsf1764102, nsf1705021},
+ number = {2},
+ title = {SkyhookDM: Data Processing in Ceph with Programmable Storage},
+ volume = {45},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGVmZXZyZS1sb2dpbjIwLnBkZk8RAWwAAAAAAWwAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xNsZWZldnJlLWxvZ2luMjAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2wl6sgAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTAAAAgA4LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpMOmxlZmV2cmUtbG9naW4yMC5wZGYADgAoABMAbABlAGYAZQB2AHIAZQAtAGwAbwBnAGkAbgAyADAALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADZVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9ML2xlZmV2cmUtbG9naW4yMC5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAVQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF}}
+
+@inproceedings{liu:hotedge20,
+ address = {Boston, MA},
+ author = {Jianshen Liu and Matthew Leon Curry and Carlos Maltzahn and Philip Kufeldt},
+ booktitle = {HotEdge'20},
+ date-added = {2020-04-19 12:38:42 -0700},
+ date-modified = {2020-07-01 12:35:59 -0700},
+ keywords = {papers, edge, reliability, disaggregation, embedded, failures, cross, nsf1836650, nsf1764102, nsf1705021},
+ month = {July 14},
+ title = {Scale-out Edge Storage Systems with Embedded Storage Nodes to Get Better Availability and Cost-Efficiency At the Same Time},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0wvbGl1LWhvdGVkZ2UyMC5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RbGl1LWhvdGVkZ2UyMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9sdgrIAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUwAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6TDpsaXUtaG90ZWRnZTIwLnBkZgAOACQAEQBsAGkAdQAtAGgAbwB0AGUAZABnAGUAMgAwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTC9saXUtaG90ZWRnZTIwLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=}}
+
+@inproceedings{chu:irishep20poster,
+ address = {Princeton, NJ},
+ author = {Aaron Chu and Ivo Jimenez and Jeff LeFevre and Carlos Maltzahn},
+	booktitle = {Poster Session at IRIS-HEP},
+ date-added = {2020-03-09 22:19:08 -0700},
+ date-modified = {2020-07-01 12:36:40 -0700},
+ keywords = {poster, programmable, storage, hep, nsf1836650},
+ month = {February 27},
+ title = {SkyhookDM: Programmable Storage for Datasets},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWlyaXNoZXAyMHBvc3Rlci5wZGZPEQF8AAAAAAF8AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8XY2h1LWlyaXNoZXAyMHBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9qMb7UAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAPC86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaHUtaXJpc2hlcDIwcG9zdGVyLnBkZgAOADAAFwBjAGgAdQAtAGkAcgBpAHMAaABlAHAAMgAwAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA6VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaHUtaXJpc2hlcDIwcG9zdGVyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABZAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=}}
+
+@inproceedings{chakraborty:ecpam20,
+ author = {Jayjeet Chakraborty and Ivo Jimenez and Carlos Maltzahn and Arshul Mansoori and Quincy Wofford},
+	booktitle = {Poster at 2020 Exascale Computing Project Annual Meeting, Houston, TX, February 3-7, 2020},
+ date-added = {2020-02-05 11:34:01 -0800},
+ date-modified = {2022-04-11 19:54:42 -0700},
+ keywords = {shortpapers, reproducibility, containers, workflow, automation, cross, nsf1836650},
+ title = {Popper 2.0: A Container-native Workflow Execution Engine For Testing Complex Applications and Validating Scientific Claims},
+ year = {2020},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAyLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2hha3JhYm9ydHktZWNwYW0yMC5wZGZPEQF8AAAAAAF8AAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8XY2hha3JhYm9ydHktZWNwYW0yMC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9pgUJQAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAPC86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaGFrcmFib3J0eS1lY3BhbTIwLnBkZgAOADAAFwBjAGgAYQBrAHIAYQBiAG8AcgB0AHkALQBlAGMAcABhAG0AMgAwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA6VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvQy9jaGFrcmFib3J0eS1lY3BhbTIwLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABZAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdk=},
+ bdsk-url-1 = {https://ecpannualmeeting.com/}}
+
+@inproceedings{chu:chep19,
+ abstract = {Access libraries such as ROOT and HDF5 allow users to interact with datasets using high level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage systems interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries often implement buffering and data layout that assume that large, single-threaded sequential access patterns are causing less overall latency than small parallel random access: while this is true for spinning media, it is not true for flash media. The situation is getting worse with rapidly evolving storage devices such as non-volatile memory and ever larger datasets. Our Skyhook Dataset Mapping project explores distributed dataset mapping infrastructures that can integrate and scale out existing access libraries using Ceph's extensible object model, avoiding reimplementation or even modifications of these access libraries as much as possible. These programmable storage extensions coupled with our distributed dataset mapping techniques enable: 1) access library operations to be offloaded to storage system servers, 2) the independent evolution of access libraries and storage systems and 3) fully leveraging of the existing load balancing, elasticity, and failure management of distributed storage systems like Ceph. They also create more opportunities to conduct storage server-local optimizations specific to storage servers. For example, storage servers might include local key/value stores combined with chunk stores that require different optimizations than a local file system. As storage servers evolve to support new storage devices like non-volatile memory, these server-local optimizations can be implemented while minimizing disruptions to applications. We will report progress on the means by which distributed dataset mapping can be abstracted over particular access libraries, including access libraries for ROOT data, and how we address some of the challenges revolving around data partitioning and composability of access operations.},
+ address = {Adelaide, Australia},
+ author = {Aaron Chu and Jeff LeFevre and Carlos Maltzahn and Aldrin Montana and Peter Alvaro and Dana Robinson and Quincey Koziol},
+	booktitle = {24th International Conference on Computing in High Energy \& Nuclear Physics},
+ date-added = {2020-01-20 16:19:51 -0800},
+ date-modified = {2020-07-30 14:13:11 -0700},
+ keywords = {papers, programmable, declarative, objectstorage, nsf1836650},
+ month = {November 4-8},
+	note = {arXiv:2007.01789v1 (Submitted for publication)},
+ title = {SkyhookDM: Mapping Scientific Datasets to Programmable Storage},
+ year = {2019},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8OY2h1LWNoZXAxOS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9tHl+cAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUMAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QzpjaHUtY2hlcDE5LnBkZgAADgAeAA4AYwBoAHUALQBjAGgAZQBwADEAOQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS1zbGlkZXMucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FWNodS1jaGVwMTktc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////bR5eSAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFDAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkM6Y2h1LWNoZXAxOS1zbGlkZXMucGRmAA4ALAAVAGMAaAB1AC0AYwBoAGUAcAAxADkALQBzAGwAaQBkAGUAcwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0MvY2h1LWNoZXAxOS1zbGlkZXMucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==},
+ bdsk-url-1 = {https://indico.cern.ch/event/773049/contributions/3474413/}}
+
+@inproceedings{weil:lsf07,
+ address = {San Jose, CA},
+ author = {Sage Weil and Scott A. Brandt and Carlos Maltzahn},
+ booktitle = {Linux Storage and Filesystem Workshop (LSF07), held in conjunction with the Conference on File and Storage Technology (FAST 07)},
+ date-added = {2019-12-29 16:46:38 -0800},
+ date-modified = {2019-12-29 16:46:38 -0800},
+ keywords = {shortpapers, storage, scalable},
+ month = {February 12--13},
+ title = {Scaling Linux Storage to Petabytes},
+ year = {2007},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1sc2YwNy5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8Od2VpbC1sc2YwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9ouiPYAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAVcAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6Vzp3ZWlsLWxzZjA3LnBkZgAADgAeAA4AdwBlAGkAbAAtAGwAcwBmADAANwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1sc2YwNy5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==}}
+
+@inproceedings{estolano:fast08wip,
+ address = {San Jose, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Sage Weil and Scott Brandt},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:38:04 -0800},
+ date-modified = {2019-12-29 16:39:22 -0800},
+ keywords = {shortpapers, loadbalancing, objectstorage, distributed, storage},
+ month = {February 26-29},
+ title = {Dynamic Load Balancing in Ceph},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAzLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1mYXN0MDh3aXAucGRmTxEBgAAAAAABgAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FmVzdG9sYW5vLWZhc3QwOHdpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aLob5AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANFLUYAAAIAPS86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6RS1GOmVzdG9sYW5vLWZhc3QwOHdpcC5wZGYAAA4ALgAWAGUAcwB0AG8AbABhAG4AbwAtAGYAYQBzAHQAMAA4AHcAaQBwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA7VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvRS1GL2VzdG9sYW5vLWZhc3QwOHdpcC5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB3g==}}
+
+@inproceedings{pye:fast08wip,
+ address = {San Jose, CA},
+ author = {Ian Pye and Scott Brandt and Carlos Maltzahn},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:29:20 -0800},
+ date-modified = {2019-12-29 16:30:47 -0800},
+ keywords = {shortpapers, p2p, filesystems, global},
+ month = {February 26-29},
+ title = {Ringer: A Global-Scale Lightweight P2P File Service},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcHllLWZhc3QwOHdpcC5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RcHllLWZhc3QwOHdpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9ouhUIAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAVAAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6UDpweWUtZmFzdDA4d2lwLnBkZgAOACQAEQBwAHkAZQAtAGYAYQBzAHQAMAA4AHcAaQBwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUC9weWUtZmFzdDA4d2lwLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=}}
+
+@inproceedings{bigelow:fast08wip,
+ address = {San Jose, CA},
+ author = {David Bigelow and Scott A. Brandt and Carlos Maltzahn and Sage Weil},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:25:47 -0800},
+ date-modified = {2019-12-29 16:31:55 -0800},
+ keywords = {shortpapers, raid, objectstorage},
+ month = {February 26-29},
+ title = {Adapting RAID Methods for Use in Object Storage Systems},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0IvYmlnZWxvdy1mYXN0MDh3aXAucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FWJpZ2Vsb3ctZmFzdDA4d2lwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aLoQoAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFCAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkI6YmlnZWxvdy1mYXN0MDh3aXAucGRmAA4ALAAVAGIAaQBnAGUAbABvAHcALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0IvYmlnZWxvdy1mYXN0MDh3aXAucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
+
+@inproceedings{maltzahn:fast08wip,
+ address = {San Jose, CA},
+ author = {Carlos Maltzahn},
+ booktitle = {Work-in-Progress Session of the USENIX Conference on File and Storage Technology (FAST 2008)},
+ date-added = {2019-12-29 16:18:24 -0800},
+ date-modified = {2020-01-04 20:29:07 -0700},
+ keywords = {shortpapers, filesystems, metadata, pim},
+ month = {February 26-29},
+ title = {How Private are Home Directories?},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tZmFzdDA4d2lwLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xZtYWx0emFobi1mYXN0MDh3aXAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2i6CqAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTQAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLWZhc3QwOHdpcC5wZGYAAA4ALgAWAG0AYQBsAHQAegBhAGgAbgAtAGYAYQBzAHQAMAA4AHcAaQBwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi1mYXN0MDh3aXAucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=}}
+
+@inproceedings{bhagwan:scc09,
+ address = {Bangalore, India},
+ author = {Varun Bhagwan and Carlos Maltzahn},
+	booktitle = {Work-in-Progress Session at the 2009 IEEE International Conference on Services Computing (SCC 2009)},
+ date-added = {2019-12-29 16:11:09 -0800},
+ date-modified = {2019-12-29 16:11:52 -0800},
+ keywords = {shortpapers, crowdsourcing, metadata, filesystems},
+ month = {September 21--25},
+ title = {JabberWocky: Crowd-Sourcing Metadata for Files},
+ year = {2009},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0IvdmJoYWd3YW4tc2NjMDkucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////EnZiaGFnd2FuLXNjYzA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aLoEFAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFCAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkI6dmJoYWd3YW4tc2NjMDkucGRmAAAOACYAEgB2AGIAaABhAGcAdwBhAG4ALQBzAGMAYwAwADkALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9CL3ZiaGFnd2FuLXNjYzA5LnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC}}
+
+@inproceedings{wacha:fast10poster,
+ address = {San Jose, CA},
+ author = {Rosie Wacha and Scott A. Brandt and Carlos Maltzahn},
+	booktitle = {Poster Session at the Conference on File and Storage Technology (FAST 2010)},
+ date-added = {2019-12-27 10:40:59 -0800},
+ date-modified = {2019-12-27 10:43:18 -0800},
+ keywords = {shortpapers, flash, RAID},
+ month = {February 24-27},
+ title = {RAID4S: Adding SSDs to RAID Arrays},
+ year = {2010},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1cvd2FjaGEtZmFzdDEwcG9zdGVyLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xZ3YWNoYS1mYXN0MTBwb3N0ZXIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2iuQ3AAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABVwAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpXOndhY2hhLWZhc3QxMHBvc3Rlci5wZGYAAA4ALgAWAHcAYQBjAGgAYQAtAGYAYQBzAHQAMQAwAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvVy93YWNoYS1mYXN0MTBwb3N0ZXIucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=},
+ bdsk-url-1 = {http://users.soe.ucsc.edu/~carlosm/Papers/S11.pdf}}
+
+@inproceedings{ames:fast10poster,
+ address = {San Jose, CA},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ booktitle = {Poster Session at the Conference on File and Storage Technology (FAST 2010)},
+ date-added = {2019-12-26 20:23:07 -0800},
+ date-modified = {2019-12-29 16:32:23 -0800},
+ keywords = {shortpapers, filesystems, linking, metadata},
+ month = {February 24-27},
+ title = {Design and Implementation of a Metadata-Rich File System},
+ year = {2010},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1mYXN0MTBwb3N0ZXIucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FWFtZXMtZmFzdDEwcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aKsejAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFBAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkE6YW1lcy1mYXN0MTBwb3N0ZXIucGRmAA4ALAAVAGEAbQBlAHMALQBmAGEAcwB0ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1mYXN0MTBwb3N0ZXIucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
+
+@inproceedings{polte:pdsw10poster,
+ address = {New Orleans, LA},
+	author = {Milo Polte and Esteban Molina-Estolano and John Bent and Garth Gibson and Carlos Maltzahn and Maya B. Gokhale and Scott Brandt},
+ booktitle = {Poster Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010},
+ date-added = {2019-12-26 20:08:27 -0800},
+ date-modified = {2019-12-29 16:32:38 -0800},
+ keywords = {shortpapers, parallel, filesystems, cloudcomputing},
+ month = {November 15},
+ title = {PLFS and HDFS: Enabling Parallel Filesystem Semantics In The Cloud},
+ year = {2010},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG9sdGUtcGRzdzEwcG9zdGVyLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xZwb2x0ZS1wZHN3MTBwb3N0ZXIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2irETwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABUAAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpQOnBvbHRlLXBkc3cxMHBvc3Rlci5wZGYAAA4ALgAWAHAAbwBsAHQAZQAtAHAAZABzAHcAMQAwAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUC9wb2x0ZS1wZHN3MTBwb3N0ZXIucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=},
+ bdsk-file-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA4Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG9sdGUtcGRzdzEwcG9zdGVyLXBvc3Rlci5wZGZPEQGUAAAAAAGUAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8dcG9sdGUtcGRzdzEwcG9zdGVyLXBvc3Rlci5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9oqxJIAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAVAAAAIAQi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6UDpwb2x0ZS1wZHN3MTBwb3N0ZXItcG9zdGVyLnBkZgAOADwAHQBwAG8AbAB0AGUALQBwAGQAcwB3ADEAMABwAG8AcwB0AGUAcgAtAHAAbwBzAHQAZQByAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgBAVXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUC9wb2x0ZS1wZHN3MTBwb3N0ZXItcG9zdGVyLnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABfAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAfc=},
+ bdsk-file-3 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA1Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG9sdGUtcGRzdzEwcG9zdGVyLXdpcC5wZGZPEQGKAAAAAAGKAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8acG9sdGUtcGRzdzEwcG9zdGVyLXdpcC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9oqxLsAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAVAAAAIAPy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6UDpwb2x0ZS1wZHN3MTBwb3N0ZXItd2lwLnBkZgAADgA2ABoAcABvAGwAdABlAC0AcABkAHMAdwAxADAAcABvAHMAdABlAHIALQB3AGkAcAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAPVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1AvcG9sdGUtcGRzdzEwcG9zdGVyLXdpcC5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFwAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAAB6g==}}
+
+@inproceedings{ames:pdsw10poster,
+ address = {New Orleans, LA},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+	booktitle = {Poster Session at 5th Petascale Data Storage Workshop (PDSW 2010), co-located with Supercomputing 2010},
+ date-added = {2019-12-26 20:05:01 -0800},
+ date-modified = {2019-12-29 16:32:49 -0800},
+ keywords = {shortpapers, linking, filesystems, metadata},
+ month = {November 15},
+ title = {QMDS: A File System Metadata Service Supporting a Graph Data Model-Based Query Language},
+ year = {2010},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1wZHN3MTBwb3N0ZXIucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FWFtZXMtcGRzdzEwcG9zdGVyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aKsNwAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFBAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkE6YW1lcy1wZHN3MTBwb3N0ZXIucGRmAA4ALAAVAGEAbQBlAHMALQBwAGQAcwB3ADEAMABwAG8AcwB0AGUAcgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1wZHN3MTBwb3N0ZXIucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
+
+@inproceedings{skourtis:fast13wip,
+ address = {San Jose, CA},
+ author = {Dimitris Skourtis and Scott A. Brandt and Carlos Maltzahn},
+ booktitle = {Work-in-Progress and Poster Session at the Conference on File and Storage Technology (FAST 2013)},
+ date-added = {2019-12-26 19:57:02 -0800},
+ date-modified = {2019-12-29 16:34:24 -0800},
+ keywords = {shortpapers, performance, predictable, flash, redundancy},
+ month = {February 12-15},
+ title = {High Performance \& Low Latency in Solid-State Drives Through Redundancy},
+ year = {2013}}
+
+@inproceedings{lofstead:cluster14poster,
+ address = {Madrid, Spain},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn and Quincey Koziol and John Bent and Eric Barton},
+ booktitle = {Poster Session at IEEE Cluster 2014},
+ date-added = {2019-12-26 19:23:07 -0800},
+ date-modified = {2019-12-29 16:34:56 -0800},
+ keywords = {shortpapers, storage, parallel, hpc, exascale},
+ month = {September 22-26},
+ title = {An Innovative Storage Stack Addressing Extreme Scale Platforms and Big Data Applications},
+ year = {2014}}
+
+@inproceedings{sevilla:fast14wip,
+ address = {San Jose, CA},
+ author = {Michael Sevilla and Scott Brandt and Carlos Maltzahn and Ike Nassi and Sam Fineberg},
+ booktitle = {Work-in-Progress and Poster Session at the 12th USENIX Conference on File and Storage Technologies (FAST 2014)},
+ date-added = {2019-12-26 19:20:27 -0800},
+ date-modified = {2019-12-29 16:35:02 -0800},
+ keywords = {shortpapers, filesystems, metadata, loadbalancing},
+ month = {February 17-20},
+ title = {Exploring Resource Migration using the CephFS Metadata Cluster},
+ year = {2014}}
+
+@inproceedings{kufeldt:fast18wip,
+ address = {Oakland, CA},
+ author = {Philip Kufeldt and Timothy Feldman and Christine Green and Grant Mackey and Carlos Maltzahn and Shingo Tanaka},
+ booktitle = {WiP and Poster Sessions at 16th USENIX Conference on File and Storage Technologies (FAST'18)},
+ date-added = {2019-12-26 19:17:05 -0800},
+ date-modified = {2019-12-29 16:35:11 -0800},
+ keywords = {shortpapers, eusocial, embedded, storage},
+ month = {February 12-15},
+ title = {Eusocial Storage Devices},
+ year = {2018}}
+
+@inproceedings{jimenez:xldb18,
+ address = {Stanford, CA},
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {Lightning Talk and Poster Session at the 11th Extremely Large Databases Conference (XLDB)},
+ date-added = {2019-12-26 19:14:42 -0800},
+ date-modified = {2019-12-29 16:35:19 -0800},
+ keywords = {shortpapers, reproducibility},
+ month = {April 30},
+ title = {Reproducible Computational and Data-Intensive Experimentation Pipelines with Popper},
+ year = {2018},
+ bdsk-url-1 = {https://www.youtube.com/watch?v=HXk_nVq8D00&list=PLE1UFlsTj5AHNXntohlhH9nYgXGU2ZqOU&index=32}}
+
+@inproceedings{maltzahn:hotstorage18-breakout,
+ address = {Boston, MA},
+ author = {Carlos Maltzahn},
+ booktitle = {Breakout Session abstract at 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage'18, co-located with USENIX ATC'18)},
+ date-added = {2019-12-26 19:10:01 -0800},
+ date-modified = {2020-01-19 16:20:17 -0800},
+ keywords = {shortpapers, storage, embedded, eusocial, programmable},
+ month = {July 9-10},
+ title = {Should Storage Devices Stay Dumb or Become Smart?},
+ year = {2018},
+ bdsk-url-1 = {https://docs.google.com/presentation/d/1yvXWpxfNWZ4NIL9GLLWM_e3TAm-8Mu-EfAygo1SRRlg/edit?usp=sharing},
+ bdsk-url-2 = {https://docs.google.com/document/d/1Vfuoy2H8Mg2PrweO5I2sP04gAZonhUIxE3_W9oMFhwI/edit?usp=sharing}}
+
+@inproceedings{kufeldt:fast19poster,
+ address = {Boston, MA},
+ author = {Philip Kufeldt and Jianshen Liu and Carlos Maltzahn},
+ booktitle = {Poster Session at 17th USENIX Conference on File and Storage Technologies (FAST'19)},
+ date-added = {2019-12-26 19:07:25 -0800},
+ date-modified = {2019-12-29 16:35:40 -0800},
+ keywords = {shortpapers, reproducibility, embedded, storage, eusocial},
+ month = {February 25-28},
+ title = {MBWU (MibeeWu): Quantifying benefits of offloading data management to storage devices},
+ year = {2019}}
+
+@inproceedings{lefevre:vault20,
+ address = {Santa Clara, CA},
+ author = {Jeff LeFevre and Carlos Maltzahn},
+ booktitle = {2020 Linux Storage and Filesystems Conference (Vault'20, co-located with FAST'20 and NSDI'20)},
+ date-added = {2019-12-26 19:04:52 -0800},
+ date-modified = {2020-07-01 12:40:06 -0700},
+ keywords = {shortpapers, programmable, storage, physicaldesign, nsf1836650, nsf1764102, nsf1705021},
+ month = {February 24-25},
+ title = {Scaling databases and file APIs with programmable Ceph object storage},
+ year = {2020}}
+
+@article{ellis:jbcs94,
+ author = {Clarence E. Ellis and Carlos Maltzahn},
+ date-added = {2019-12-26 18:50:02 -0800},
+ date-modified = {2019-12-26 18:51:29 -0800},
+ journal = {Journal of the Brazilian Computer Society, Special Edition on CSCW},
+ keywords = {papers, cscw},
+ number = {1},
+ pages = {15--23},
+ title = {Collaboration with Spreadsheets},
+ volume = {1},
+ year = {1994}}
+
+@article{jimenez:tinytocs16,
+ abstract = {Validating experimental results in the field of computer systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. By focusing on the latter and analyzing the validation workflow that an experiment re-executioner goes through, we notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics.
+Based on this insight, we introduce a declarative format for describing the high-level components of an experiment, as well as a language for specifying generic, testable statements that serve as the basis for validation [1,2]. Our language allows one to express and validate statements on top of metrics gathered at runtime. We demonstrate the feasibility of this approach by taking an experiment from an already published article and obtaining the corresponding experiment specification. We show that, if we had this specification in the first place, validating the original findings would be an almost entirely automated task. If we contrast this with the current state of our practice, where it takes days or weeks (if successful) to reproduce results, we see how making experiment specifications available as part of a publication or as addendum to experimental results can significantly aid in the validation of computer systems research.
+Acknowledgements: Work performed under auspices of US DOE by LLNL contract DE-AC52-07NA27344 ABS-684863 and by SNL contract DE-AC04-94AL85000.},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Adam Moody and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ date-added = {2019-12-26 18:43:34 -0800},
+ date-modified = {2020-01-04 21:15:26 -0700},
+ journal = {Tiny Transactions on Computer Science (TinyToCS)},
+ keywords = {papers, reproducibility, evaluation},
+ title = {I Aver: Providing Declarative Experiment Specifications Facilitates the Evaluation of Computer Systems Research},
+ volume = {4},
+ year = {2016}}
+
+@inproceedings{maltzahn:vkika91,
+ abstract = {Most CAD environments emphasize the support of individual workstations and help with cooperation among them only secondarily. We propose the reverse approach: designs emerge within interacting sharing processes that structure the shared access of all participants to concepts, tasks, and results. This approach and its consequences are illustrated using the example of software engineering. Based on a formalization of these processes, the ConceptTalk prototype controls a distributed software environment and dedicated communication tools via the knowledge base system ConceptBase. Experience with ConceptTalk supports a new paradigm that regards an information system as a medium for complex communication.},
+ author = {Carlos Maltzahn and Thomas Rose},
+ booktitle = {Verteilte K{\"u}nstliche Intelligenz und kooperatives Arbeiten},
+ date-added = {2019-12-26 18:32:03 -0800},
+ date-modified = {2020-01-04 21:16:07 -0700},
+ editor = {W. Brauer and D. Hern{\'a}ndez},
+ keywords = {papers, cscw, softwareengineering},
+ pages = {195--206},
+ publisher = {Springer-Verlag Berlin Heidelberg},
+ title = {ConceptTalk: Kooperationsunterst{\"u}tzung in Softwareumgebungen},
+ volume = {291},
+ year = {1991}}
+
+@inproceedings{leung:msst07,
+ abstract = {Achieving performance, reliability, and scalability presents a unique set of challenges for large distributed storage. To identify problem areas, there must be a way for developers to have a comprehensive view of the entire storage system. That is, users must be able to understand both node-specific behavior and complex relationships between nodes. We present a distributed file system profiling method that supports such analysis. Our approach is based on combining node-specific metrics into a single cohesive system image. This affords users two views of the storage system: a micro, per-node view, as well as a macro, multi-node view, allowing both node-specific and complex inter-nodal problems to be debugged. We visualize the storage system by displaying nodes and intuitively animating their metrics and behavior, allowing easy analysis of complex problems.},
+ address = {Santa Clara, CA},
+ author = {Andrew Leung and Eric Lalonde and Jacob Telleen and James Davis and Carlos Maltzahn},
+ booktitle = {Proceedings of the 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007)},
+ date-added = {2019-12-26 18:07:11 -0800},
+ date-modified = {2020-01-04 21:16:58 -0700},
+ keywords = {papers, performance, debugging, distributed, storage, systems},
+ month = {September},
+ title = {Using Comprehensive Analysis for Performance Debugging in Distributed Storage Systems},
+ year = {2007}}
+
+@inproceedings{lofstead:pdsw13,
+ abstract = {The rise of Integrated Application Workflows (IAWs) for processing data prior to storage on persistent media prompts the need to incorporate features that reproduce many of the semantics of persistent storage devices. One such feature is the ability to manage data sets as chunks with natural barriers between different data sets. Towards that end, we need a mechanism to ensure that data moved to an intermediate storage area is both complete and correct before allowing access by other processing components. The Doubly Distributed Transactions (D2T) protocol offers such a mechanism. The initial development [9] suffered from scalability limitations and undue requirements on server processes. The current version has addressed these limitations and has demonstrated scalability with low overhead.},
+ address = {Denver, CO},
+ author = {Jay Lofstead and Jai Dayal and Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {8th Parallel Data Storage Workshop at Supercomputing '13 (PDSW 2013)},
+ date-added = {2019-12-26 16:21:31 -0800},
+ date-modified = {2020-01-04 21:17:41 -0700},
+ keywords = {papers, transactions, datamanagement, hpc},
+ month = {November 18},
+ title = {Efficient Transactions for Parallel Data Movement},
+ year = {2013}}
+
+@inproceedings{lofstead:iasds14,
+ abstract = {The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase 1 of the project complete, it is an excellent opportunity to evaluate many of the decisions made to feed into the phase 2 effort. With this paper we not only provide a timely summary of important aspects of the design specifications but also capture the underlying reasoning that is not available elsewhere.
+The initial effort to define a next generation storage system has made admirable contributions in architecture and design. Formalizing the general idea of data staging into burst buffers for the storage system will help manage the performance variability and offer additional data processing opportunities outside the main compute and storage system. Adding a transactional mechanism to manage faults and data visibility helps enable effective analytics without having to work around the IO stack semantics. While these and other contributions are valuable, similar efforts made elsewhere may offer attractive alternatives or differing semantics that could yield a more feature rich environment with little to no additional overhead. For example, the Doubly Distributed Transactions (D2T) protocol offers an alternative approach for incorporating transactional semantics into the data path. Another project, PreDatA, examined how to get the best throughput for data operators and may offer additional insights into further refinements of the Burst Buffer concept.
+This paper examines some of the choices made by the Fast Forward team and compares them with other options and offers observations and suggestions based on these other efforts. This will include some non-core contributions of other projects, such as some of the demonstration metadata and data storage components generated while implementing D2T, to make suggestions that may help the next generation design for how the IO stack works as a whole.},
+ address = {Minneapolis, MN},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {Workshop on Interfaces and Architectures for Scientific Data Storage (IASDS 2014)},
+ date-added = {2019-12-26 16:17:49 -0800},
+ date-modified = {2020-01-04 23:08:26 -0700},
+ keywords = {papers, datamanagement, hpc},
+ month = {September 9-12},
+ title = {Consistency and Fault Tolerance Considerations for the Next Iteration of the DOE Fast Forward Storage and IO Project},
+ year = {2014}}
+
+@inproceedings{lofstead:discs14,
+ abstract = {Scientific simulations are moving away from using centralized persistent storage for intermediate data between workflow steps towards an all online model. This shift is motivated by the relatively slow IO bandwidth growth compared with compute speed increases. The challenges presented by this shift to Integrated Application Workflows are motivated by the loss of persistent storage semantics for node-to-node communication. One step towards addressing this semantics gap is using transactions to logically delineate a data set from 100,000s of processes to 1000s of servers as an atomic unit.
+Our previously demonstrated Doubly Distributed Transactions (D2T) protocol showed a high-performance solution, but had not explored how to detect and recover from faults. Instead, the focus was on demonstrating high performance in the typical case. The research presented here addresses fault detection and recovery based on the enhanced protocol design. The total overhead for a full transaction with multiple operations at 65,536 processes is on average 0.055 seconds. Fault detection and recovery mechanisms demonstrate similar performance to the success case with only the addition of appropriate timeouts for the system. This paper explores the challenges in designing a recoverable protocol for doubly distributed transactions, particularly for parallel computing environments.},
+ address = {New Orleans, LA},
+ author = {Jay Lofstead and Jai Dayal and Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {The 2014 International Workshop on Data-Intensive Scalable Computing Systems (DISCS-2014) (Workshop co-located with Supercomputing 2014)},
+ date-added = {2019-12-26 16:14:45 -0800},
+ date-modified = {2020-01-04 21:18:57 -0700},
+ keywords = {papers, datamanagement, hpc},
+ month = {November 16},
+ title = {Efficient, Failure Resilient Transactions for Parallel and Distributed Computing},
+ year = {2014}}
+
+@inproceedings{jimenez:woc15,
+ abstract = {Evaluating experimental results in the field of computer systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. In this position paper, we analyze salient features of container technology that, if leveraged correctly, can help reduce the complexity of reproducing experiments in systems research. We present a use case in the area of distributed storage systems to illustrate the extensions that we envision, mainly in terms of container management infrastructure. We also discuss the benefits and limitations of using containers as a way of reproducing research in other areas of experimental systems research.},
+ address = {Tempe, AZ},
+ author = {Ivo Jimenez and Carlos Maltzahn and Adam Moody and Kathryn Mohror and Jay Lofstead and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ booktitle = {First Workshop on Containers (WoC 2015) (Workshop co-located with IEEE International Conference on Cloud Engineering - IC2E 2015)},
+ date-added = {2019-12-26 16:08:16 -0800},
+ date-modified = {2020-01-19 16:41:52 -0800},
+ keywords = {papers, reproducibility, containers},
+ month = {March 9-13},
+ title = {The Role of Container Technology in Reproducible Computer Systems Research},
+ year = {2015}}
+
+@inproceedings{lofstead:sc16,
+ abstract = {The DOE Extreme-Scale Technology Acceleration Fast Forward Storage and IO Stack project is going to have significant impact on storage systems design within and beyond the HPC community. With phase two of the project starting, it is an excellent opportunity to explore the complete design and how it will address the needs of extreme scale platforms. This paper examines each layer of the proposed stack in some detail along with cross-cutting topics, such as transactions and metadata management.
+This paper not only provides a timely summary of important aspects of the design specifications but also captures the underlying reasoning that is not available elsewhere. We encourage the broader community to understand the design, intent, and future directions to foster discussion guiding phase two and the ultimate production storage stack based on this work. An initial performance evaluation of the early prototype implementation is also provided to validate the presented design.},
+ address = {Salt Lake City, UT},
+ author = {Jay Lofstead and Ivo Jimenez and Carlos Maltzahn and Quincey Koziol and John Bent and Eric Barton},
+ booktitle = {29th ACM and IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC16)},
+ date-added = {2019-12-26 15:58:41 -0800},
+ date-modified = {2020-01-04 21:19:51 -0700},
+ keywords = {papers, parallel, storage, hpc, exascale},
+ month = {November 13-18},
+ title = {DAOS and Friends: A Proposal for an Exascale Storage System},
+ year = {2016}}
+
+@inproceedings{jimenez:icpe18,
+ abstract = {We introduce quiho, a framework for profiling application performance that can be used in automated performance regression tests. quiho profiles an application by applying sensitivity analysis, in particular statistical regression analysis (SRA), using application-independent performance feature vectors that characterize the performance of machines. The result of the SRA, feature importance specifically, is used as a proxy to identify hardware and low-level system software behavior. The relative importance of these features serves as a performance profile of an application (termed inferred resource utilization profile or IRUP), which is used to automatically validate performance behavior across multiple revisions of an application's code base without having to instrument code or obtain performance counters. We demonstrate that quiho can successfully discover performance regressions by showing its effectiveness in profiling application performance for synthetically introduced regressions as well as those found in real-world applications.},
+ address = {Berlin, Germany},
+ author = {Ivo Jimenez and Noah Watkins and Michael Sevilla and Jay Lofstead and Carlos Maltzahn},
+ booktitle = {9th ACM/SPEC International Conference on Performance Engineering (ICPE 2018)},
+ date-added = {2019-12-26 15:51:19 -0800},
+ date-modified = {2020-07-01 12:46:23 -0700},
+ keywords = {papers, reproducibility, performance, testing, cross, sandia, nsf1450488},
+ month = {April 9-13},
+ title = {quiho: Automated Performance Regression Testing Using Inferred Resource Utilization Profiles},
+ year = {2018}}
+
+@inproceedings{jimenez:rescue-hpc18,
+ abstract = {Advances in agile software delivery methodologies and tools (commonly referred to as DevOps) have not yet materialized in academic scenarios such as university, industry and government laboratories. In this position paper we make the case for Black Swan, a platform for the agile implementation, maintenance and curation of experimentation pipelines by embracing a DevOps approach.},
+ address = {Dallas, TX},
+ author = {Ivo Jimenez and Carlos Maltzahn},
+ booktitle = {1st Workshop on Reproducible, Customizable and Portable Workflows for HPC (ResCuE-HPC'18, co-located with SC'18)},
+ date-added = {2019-12-26 15:45:05 -0800},
+ date-modified = {2020-07-01 12:44:44 -0700},
+ keywords = {papers, reproducibility, cross},
+ month = {November 11},
+ title = {Spotting Black Swans With Ease: The Case for a Practical Reproducibility Platform},
+ year = {2018}}
+
+@inproceedings{liu:iodc19,
+ abstract = {The storage industry is considering new kinds of storage devices that support data access function offloading, i.e. the ability to perform data access functions on the storage device itself as opposed to performing it on a separate compute system to which the storage device is connected. But what is the benefit of offloading to a storage device that is controlled by an embedded platform, very different from a host platform? To quantify the benefit, we need a measurement methodology that enables apples-to-apples comparisons between different platforms. We propose a Media-based Work Unit (MBWU, pronounced ``MibeeWu''), and an MBWU-based measurement methodology to standardize the platform efficiency evaluation so as to quantify the benefit of offloading. To demonstrate the merit of this methodology, we implemented a prototype to automate quantifying the benefit of offloading the key-value data access function.},
+ address = {Frankfurt a. M., Germany},
+ author = {Jianshen Liu and Philip Kufeldt and Carlos Maltzahn},
+ booktitle = {HPC I/O in the Data Center Workshop (HPC-IODC 2019, co-located with ISC-HPC 2019)},
+ date-added = {2019-12-26 15:40:05 -0800},
+ date-modified = {2020-07-01 13:11:21 -0700},
+ keywords = {papers, reproducibility, performance, embedded, storage, eusocial, cross},
+ month = {June 20},
+ title = {MBWU: Benefit Quantification for Data Access Function Offloading},
+ year = {2019}}
+
+@inproceedings{dahlgren:pdsw19,
+ abstract = {In the post-Moore era, systems and devices with new architectures will arrive at a rapid rate with significant impacts on the software stack. Applications will not be able to fully benefit from new architectures unless they can delegate adapting to new devices in lower layers of the stack. In this paper we introduce physical design management which deals with the problem of identifying and executing transformations on physical designs of stored data, i.e. how data is mapped to storage abstractions like files, objects, or blocks, in order to improve performance. Physical design is traditionally placed with applications, access libraries, and databases, using hard-wired assumptions about underlying storage systems. Yet, storage systems increasingly not only contain multiple kinds of storage devices with vastly different performance profiles but also move data among those storage devices, thereby changing the benefit of a particular physical design. We advocate placing physical design management in storage, identify interesting research challenges, provide a brief description of a prototype implementation in Ceph, and discuss the results of initial experiments at scale that are replicable using Cloudlab. These experiments show performance and resource utilization trade-offs associated with choosing different physical designs and choosing to transform between physical designs.},
+ address = {Denver, CO},
+ author = {Kathryn Dahlgren and Jeff LeFevre and Ashay Shirwadkar and Ken Iizawa and Aldrin Montana and Peter Alvaro and Carlos Maltzahn},
+ booktitle = {4th International Parallel Data Systems Workshop (PDSW 2019, co-located with SC'19)},
+ date-added = {2019-12-26 15:35:44 -0800},
+ date-modified = {2020-07-01 12:44:17 -0700},
+ keywords = {papers, programmable, storage, datamanagement, physicaldesign, cross, nsf1836650, nsf1764102, nsf1705021},
+ month = {November 18},
+ title = {Towards Physical Design Management in Storage Systems},
+ year = {2019}}
+
+@inproceedings{uta:nsdi20,
+ abstract = {Performance variability has been acknowledged as a problem for over a decade by cloud practitioners and performance engineers. Yet, our survey of top systems conferences reveals that the research community regularly disregards variability when running experiments in the cloud. Focusing on networks, we assess the impact of variability on cloud-based big-data workloads by gathering traces from mainstream commercial clouds and private research clouds. Our data collection consists of millions of datapoints gathered while transferring over 9 petabytes of data. We characterize the network variability present in our data and show that, even though commercial cloud providers implement mechanisms for quality-of-service enforcement, variability still occurs, and is even exacerbated by such mechanisms and service provider policies. We show how big-data workloads suffer from significant slowdowns and lack predictability and replicability, even when state-of-the-art experimentation techniques are used. We provide guidelines for practitioners to reduce the volatility of big data performance, making experiments more repeatable.},
+ address = {Santa Clara, CA},
+ author = {Alexandru Uta and Alexandru Custura and Dmitry Duplyakin and Ivo Jimenez and Jan Rellermeyer and Carlos Maltzahn and Robert Ricci and Alexandru Iosup},
+ booktitle = {NSDI '20},
+ date-added = {2019-12-26 15:33:24 -0800},
+ date-modified = {2020-07-01 12:48:02 -0700},
+ keywords = {papers, reproducibility, datacenter, performance, cross, nsf1450488, nsf1705021, nsf1764102, nsf1836650},
+ month = {February 25-27},
+ title = {Is Big Data Performance Reproducible in Modern Cloud Networks?},
+ year = {2020}}
+
+@inproceedings{lefevre:vault19,
+ abstract = {Ceph is an open source distributed storage system that is object-based and massively scalable. Ceph provides developers with the capability to create data interfaces that can take advantage of local CPU and memory on the storage nodes (Ceph Object Storage Devices). These interfaces are powerful for application developers and can be created in C, C++, and Lua.
+
+Skyhook is an open source storage and database project in the Center for Research in Open Source Software at UC Santa Cruz. Skyhook uses these capabilities in Ceph to create specialized read/write interfaces that leverage IO and CPU within the storage layer toward database processing and management. Specifically, we develop methods to apply predicates locally as well as additional metadata and indexing capabilities using Ceph's internal indexing mechanism built on top of RocksDB.
+
+Skyhook's approach helps to enable scale-out of a single node database system by scaling out the storage layer. Our results show the performance benefits for some queries indeed scale well as the storage layer scales out.},
+ address = {Boston, MA},
+ author = {Jeff LeFevre and Noah Watkins and Michael Sevilla and Carlos Maltzahn},
+ booktitle = {2019 Linux Storage and Filesystems Conference (Vault'19, co-located with FAST'19)},
+ date-added = {2019-08-07 17:58:01 -0700},
+ date-modified = {2020-07-01 12:49:10 -0700},
+ keywords = {papers, programmable, storage, database, cross, nsf1705021, nsf1764102, nsf1836650},
+ month = {February 25-26},
+ title = {Skyhook: Programmable storage for databases},
+ year = {2019}}
+
+@inproceedings{david:precs19,
+ abstract = {Computer network research experiments can be broadly grouped in three categories: simulated, controlled, and real-world experiments. Simulation frameworks, experiment testbeds and measurement tools, respectively, are commonly used as the platforms for carrying out network experiments. In many cases, given the nature of computer network experiments, properly configuring these platforms is a complex and time-consuming task, which makes replicating and validating research results quite challenging. This complexity can be reduced by leveraging tools that enable experiment reproducibility. In this paper, we show how a recently proposed reproducibility tool called Popper facilitates the reproduction of networking experiments. In particular, we detail the steps taken to reproduce results in two published articles that rely on simulations. The outcome of this exercise is a generic workflow for carrying out network simulation experiments. In addition, we briefly present two additional Popper workflows for running experiments on controlled testbeds, as well as studies that gather real-world metrics (all code is publicly available on Github). We close by providing a list of lessons we learned throughout this process.},
+ author = {Andrea David and Mariette Souppe and Ivo Jimenez and Katia Obraczka and Sam Mansfield and Kerry Veenstra and Carlos Maltzahn},
+ booktitle = {P-RECS'19},
+ date-added = {2019-06-25 11:22:58 -0700},
+ date-modified = {2020-07-01 12:50:12 -0700},
+ keywords = {papers, reproducibility, networking, experience, cross, nsf1450488, nsf1836650},
+ month = {June 24},
+ title = {Reproducible Computer Network Experiments: A Case Study Using Popper},
+ year = {2019}}
+
+@unpublished{liu:ocpgs19,
+ author = {Jianshen Liu and Philip Kufeldt and Carlos Maltzahn},
+ date-added = {2019-05-06 18:39:54 -0700},
+ date-modified = {2020-07-01 12:51:05 -0700},
+ keywords = {shortpapers, eusocial, storagemedium, performance, cross},
+ month = {March 14-15},
+ note = {Poster at OCP Global Summit 2019},
+ title = {Quantifying benefits of offloading data management to storage devices},
+ year = {2019}}
+
+@inproceedings{sevilla:hotstorage18,
+ abstract = {The file system metadata service is the scalability bottleneck for many of today's workloads. Common approaches for attacking this ``metadata scaling wall'' include: caching inodes on clients and servers, caching parent inodes for path traversal, and dynamic caching policies that exploit workload locality. These caches reduce the number of remote procedure calls (RPCs) but the effectiveness is dependent on the overhead of maintaining cache coherence and the administrator's ability to select the best cache size for the given workloads. Recent work reduces the number of metadata RPCs to 1 without using a cache at all, by letting clients ``decouple'' the subtrees from the global namespace so that they can do metadata operations locally. Even with this technique, we show that file system metadata is still a bottleneck because namespaces for today's workloads can be very large. The size is problematic for reads because metadata needs to be transferred and materialized.
+
+The management techniques for file system metadata assume that namespaces have no structure but we observe that this is not the case for all workloads. We propose Tintenfisch, a file system that allows users to succinctly express the structure of the metadata they intend to create. If a user can express the structure of the namespace, Tintenfisch clients and servers can (1) compact metadata, (2) modify large namespaces more quickly, and (3) generate only relevant parts of the namespace. This reduces network traffic, storage footprints, and the number of overall metadata operations needed to complete a job.},
+ address = {Boston, MA},
+ annote = {Submitted to HotStorage'18},
+ author = {Michael A. Sevilla and Reza Nasirigerdeh and Carlos Maltzahn and Jeff LeFevre and Noah Watkins and Peter Alvaro and Margaret Lawson and Jay Lofstead and Jim Pivarski},
+ booktitle = {HotStorage '18},
+ date-added = {2018-09-04 00:39:56 -0700},
+ date-modified = {2020-07-01 12:53:25 -0700},
+ keywords = {papers, metadata, filesystems, scalable, naming, cross, doeDE-SC0016074, nsf1450488, nsf1705021},
+ month = {July 9-10},
+ title = {Tintenfisch: File System Namespace Schemas and Generators},
+ year = {2018}}
+
+@inproceedings{maricq:osdi18,
+ abstract = {The performance of compute hardware varies: software run repeatedly on the same server (or a different server with supposedly identical parts) can produce performance results that differ with each execution. This variation has important effects on the reproducibility of systems research and ability to quantitatively compare the performance of different systems. It also has implications for commercial computing, where agreements are often made conditioned on meeting specific performance targets.
+Over a period of 10 months, we conducted a large-scale study capturing nearly 900,000 data points from 835 servers. We examine this data from two perspectives: that of a service provider wishing to offer a consistent environment, and that of a systems researcher who must understand how variability impacts experimental results. From this examination, we draw a number of lessons about the types and magnitudes of performance variability and the effects on confidence in experiment results. We also create a statistical model that can be used to understand how representative an individual server is of the general population. The full dataset and our analysis tools are publicly available, and we have built a system to interactively explore the data and make recommendations for experiment parameters based on statistical analysis of historical data.},
+ address = {Carlsbad, CA},
+ author = {Aleksander Maricq and Dmitry Duplyakin and Ivo Jimenez and Carlos Maltzahn and Ryan Stutsman and Robert Ricci},
+ booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI'18)},
+ date-added = {2018-07-21 02:10:24 +0000},
+ date-modified = {2020-07-01 12:54:52 -0700},
+ keywords = {papers, performance, statistics, cloud, reproducibility, systems, nsf1450488, cross},
+ month = {October 8-10},
+ title = {Taming Performance Variability},
+ year = {2018}}
+
+@inproceedings{sevilla:ccgrid18,
+ abstract = {Our analysis of the key-value activity generated by the ParSplice molecular dynamics simulation demonstrates the need for more complex cache management strategies. Baseline measurements show clear key access patterns and hot spots that offer significant opportunity for optimization. We use the data management language and policy engine from the Mantle system to dynamically explore a variety of techniques, ranging from basic algorithms and heuristics to statistical models, calculus, and machine learning. While Mantle was originally designed for distributed file systems, we show how the collection of abstractions effectively decomposes the problem into manageable policies for a different application and storage system. Our exploration of this space results in a dynamically sized cache policy that does not sacrifice any performance while using 32-66% less memory than the default ParSplice configuration.},
+ address = {Washington, DC},
+ author = {Michael A. Sevilla and Carlos Maltzahn and Peter Alvaro and Reza Nasirigerdeh and Bradley W. Settlemyer and Danny Perez and David Rich and Galen M. Shipman},
+ booktitle = {CCGRID '18},
+ date-added = {2018-07-01 21:56:37 +0000},
+ date-modified = {2020-07-01 12:57:24 -0700},
+ keywords = {papers, caching, programmable, storage, hpc, doeDE-SC0016074, cross},
+ month = {May 1-4},
+ title = {Programmable Caches with a Data Management Language \& Policy Engine},
+ year = {2018}}
+
+@inproceedings{sevilla:precs18,
+ abstract = {We describe the four publications we have tried to make reproducible and discuss how each paper has changed our workflows, practices, and collaboration policies. The fundamental insight is that paper artifacts must be made reproducible from the start of the project; artifacts are too difficult to make reproducible when the papers are (1) already published and (2) authored by researchers that are not thinking about reproducibility. In this paper, we present the best practices adopted by our research laboratory, which was sculpted by the pitfalls we have identified for the Popper convention. We conclude with a ``call-to-arms'' for the community focused on enhancing reproducibility initiatives for academic conferences, industry environments, and national laboratories. We hope that our experiences will shape a best practices guide for future reproducible papers.},
+ address = {Tempe, AZ},
+ author = {Michael A. Sevilla and Carlos Maltzahn},
+ booktitle = {P-RECS'18},
+ date-added = {2018-06-12 17:20:57 +0000},
+ date-modified = {2020-07-01 12:57:49 -0700},
+ keywords = {papers, reproducibility, experience, cross, nsf1450488},
+ month = {June 11},
+ title = {Popper Pitfalls: Experiences Following a Reproducibility Convention},
+ year = {2018}}
+
+@article{kufeldt:login18,
+ abstract = {As storage devices get faster, data management tasks rob the host of CPU cycles and DDR bandwidth. In this article, we examine a new interface to storage devices that can leverage existing and new CPU and DRAM resources to take over data management tasks like availability, recovery, and migrations. This new interface provides a roadmap for device-to-device interactions and more powerful storage devices capable of providing in-store compute services that can dramatically improve performance. We call such storage devices ``eusocial'' because we are inspired by eusocial insects like ants, termites, and bees, which as individuals are primitive but collectively accomplish amazing things.
+},
+ author = {Philip Kufeldt and Carlos Maltzahn and Tim Feldman and Christine Green and Grant Mackey and Shingo Tanaka},
+ date-added = {2018-06-06 16:06:14 +0000},
+ date-modified = {2020-07-01 12:58:56 -0700},
+ journal = {;login: The USENIX Magazine},
+ keywords = {papers, storage, devices, networking, flash, offloading, cross},
+ number = {2},
+ pages = {16--22},
+ title = {Eusocial Storage Devices - Offloading Data Management to Storage Devices that Can Act Collectively},
+ volume = {43},
+ year = {2018}}
+
+@inproceedings{jimenez:pdsw15,
+ abstract = {Validating experimental results in the field of storage systems is a challenging task, mainly due to the many changes in software and hardware that computational environments go through. Determining if an experiment is reproducible entails two separate tasks: re-executing the experiment and validating the results. Existing reproducibility efforts have focused on the former, envisioning techniques and infrastructures that make it easier to re-execute an experiment. In this position paper, we focus on the latter by analyzing the validation workflow that an experiment re-executioner goes through. We notice that validating results is done on the basis of experiment design and high-level goals, rather than exact quantitative metrics. Based on this insight, we introduce a declarative format for specifying the high-level components of an experiment as well as describing generic, testable conditions that serve as the basis for validation. We present a use case in the area of distributed storage systems to illustrate the usefulness of this approach.},
+ address = {Austin, TX},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Adam Moody and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ booktitle = {PDSW'15},
+ date-added = {2018-05-15 06:28:35 +0000},
+ date-modified = {2020-01-04 23:42:08 -0700},
+ keywords = {papers, reproducibility, declarative},
+ month = {November 15},
+ title = {Tackling the Reproducibility Problem in Storage Systems Research with Declarative Experiment Specifications},
+ year = {2015}}
+
+@techreport{sevilla:ucsctr18,
+ address = {Santa Cruz, CA},
+ annote = {Submitted to HotStorage'18},
+ author = {Michael A. Sevilla and Reza Nasirigerdeh and Carlos Maltzahn and Jeff LeFevre and Noah Watkins and Peter Alvaro and Margaret Lawson and Jay Lofstead and Jim Pivarski},
+ date-added = {2018-04-08 04:09:23 +0000},
+ date-modified = {2018-04-08 04:13:07 +0000},
+ institution = {UC Santa Cruz},
+ keywords = {papers, metadata, filesystems, scalable, naming},
+ month = {April 7},
+ number = {UCSC-SOE-18-08},
+ title = {Tintenfisch: File System Namespace Schemas and Generators},
+ type = {Tech. rept.},
+ year = {2018}}
+
+@inproceedings{jia:hipc17,
+ abstract = {Accessing external resources (e.g., loading input data, checkpointing snapshots, and out-of-core processing) can have a significant impact on the performance of applications. However, no existing programming systems for high-performance computing directly manage and optimize external accesses. As a result, users must explicitly manage external accesses alongside their computation at the application level, which can result in both correctness and performance issues.
+We address this limitation by introducing Iris, a task-based programming model with semantics for external resources. Iris allows applications to describe their access requirements to external resources and the relationship of those accesses to the computation. Iris incorporates external I/O into a deferred execution model, reschedules external I/O to overlap I/O with computation, and reduces external I/O when possible. We evaluate Iris on three microbenchmarks representative of important workloads in HPC and a full combustion simulation, S3D. We demonstrate that the Iris implementation of S3D reduces the external I/O overhead by up to 20x, compared to the Legion and the Fortran implementations.},
+ address = {Jaipur, India},
+ author = {Zhihao Jia and Sean Treichler and Galen Shipman and Michael Bauer and Noah Watkins and Carlos Maltzahn and Pat McCormick and Alex Aiken},
+ booktitle = {HiPC 2017},
+ date-added = {2018-04-03 18:26:23 +0000},
+ date-modified = {2020-07-01 12:59:49 -0700},
+ keywords = {papers, runtime, distributed, programming, storage, cross, doeDE-SC0016074, nsf1450488},
+ month = {December 18-21},
+ title = {Integrating External Resources with a Task-Based Programming Model},
+ year = {2017}}
+
+@inproceedings{sevilla:ipdps18,
+ abstract = {HPC and data center scale application developers are abandoning POSIX IO because file system metadata synchronization and serialization overheads of providing strong consistency and durability are too costly -- and often unnecessary -- for their applications. Unfortunately, designing file systems with weaker consistency or durability semantics excludes applications that rely on stronger guarantees, forcing developers to re-write their applications or deploy them on a different system. We present a framework and API that lets administrators specify their consistency/durability requirements and dynamically assign them to subtrees in the same namespace, allowing administrators to optimize subtrees over time and space for different workloads. We show similar speedups to related work but more importantly, we show performance improvements when we custom fit subtree semantics to applications such as checkpoint-restart (91.7x speedup), user home directories (0.03 standard deviation from optimal), and users checking for partial results (2\% overhead).},
+ address = {Vancouver, BC, Canada},
+ author = {Michael A. Sevilla and Ivo Jimenez and Noah Watkins and Jeff LeFevre and Peter Alvaro and Shel Finkelstein and Patrick Donnelly and Carlos Maltzahn},
+ booktitle = {IPDPS 2018},
+ date-added = {2018-03-19 21:24:16 +0000},
+ date-modified = {2020-07-01 13:03:23 -0700},
+ keywords = {papers, metadata, datamanagement, programmable, filesystems, storage, systems, cross, nsf1450488, doeDE-SC0016074},
+ month = {May 21-25},
+ title = {Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace},
+ year = {2018}}
+
+@inproceedings{ionkov:pdsw17,
+ abstract = {Scientific workflows contain an increasing number of interacting applications, often with big disparity between the formats of data being produced and consumed by different applications. This mismatch can result in performance degradation as data retrieval causes multiple read operations (often to a remote storage system) in order to convert the data. Although some parallel filesystems and middleware libraries attempt to identify access patterns and optimize data retrieval, they frequently fail if the patterns are complex.
+The goal of ASGARD is to replace I/O operations issued to a file by the processes with a single operation that passes enough semantic information to the storage system, so it can combine (and eventually optimize) the data movement. ASGARD allows application developers to define their application's abstract dataset as well as the subsets of the data (fragments) that are created and used by the HPC codes. It uses the semantic information to generate and execute transformation rules that convert the data between the memory layouts of the producer and consumer applications, as well as the layout on nonvolatile storage. The transformation engine implements functionality similar to the scatter/gather support available in some file systems. Since data subsets are defined during the initialization phase, i.e., well in advance from the time they are used to store and retrieve data, the storage system has multiple opportunities to optimize both the data layout and the transformation rules in order to increase the overall I/O performance.
+In order to evaluate ASGARD's performance, we added support for ASGARD's transformation rules to Ceph's object store RADOS. We created Ceph data objects that allow custom data striping based on ASGARD's fragment definitions. Our tests with the extended RADOS show up to 5 times performance improvements for writes and 10 times performance improvements for reads over collective MPI I/O.},
+ address = {Denver, CO},
+ author = {Latchesar Ionkov and Carlos Maltzahn and Michael Lang},
+ booktitle = {PDSW-DISCS 2017 at SC17},
+ date-added = {2017-11-07 16:45:07 +0000},
+ date-modified = {2020-01-04 21:39:53 -0700},
+ keywords = {papers, replication, layout, language},
+ month = {Nov 13},
+ title = {Optimized Scatter/Gather Data Operations for Parallel Storage},
+ year = {2017}}
+
+@article{hacker:bams17,
+ abstract = {Software containers can revolutionize research and education with numerical weather prediction models by easing use and guaranteeing reproducibility.},
+ author = {Joshua P. Hacker and John Exby and David Gill and Ivo Jimenez and Carlos Maltzahn and Timothy See and Gretchen Mullendore and Kathryn Fossell},
+ date-added = {2017-08-29 05:50:47 +0000},
+ date-modified = {2020-01-04 21:40:58 -0700},
+ journal = {Bull. Amer. Meteor. Soc.},
+ keywords = {papers, containers, nwp, learning},
+ pages = {1129--1138},
+ title = {A Containerized Mesoscale Model and Analysis Toolkit to Accelerate Classroom Learning, Collaborative Research, and Uncertainty Quantification},
+ volume = {98},
+ year = {2017}}
+
+@inproceedings{jimenez:cnert17,
+ abstract = {This paper introduces PopperCI, a continuous integration (CI) service hosted at UC Santa Cruz that allows researchers to automate the end-to-end execution and validation of experiments. PopperCI assumes that experiments follow Popper, a convention for implementing experiments and writing articles following a DevOps approach that has been proposed recently. PopperCI runs experiments on public, private or government-funded cloud infrastructures in a fully automated way. We describe how PopperCI executes experiments and present a use case that illustrates the usefulness of the service.},
+ address = {Atlanta, GA},
+ author = {Ivo Jimenez and Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau and Jay Lofstead and Carlos Maltzahn and Kathryn Mohror and Robert Ricci},
+ booktitle = {Workshop on Computer and Networking Experimental Research Using Testbeds (CNERT'17) in conjunction with IEEE INFOCOM 2017},
+ date-added = {2017-07-31 03:37:33 +0000},
+ date-modified = {2020-01-04 21:41:20 -0700},
+ keywords = {papers, reproducibility, devops},
+ month = {May 1},
+ title = {PopperCI: Automated Reproducibility Validation},
+ year = {2017}}
+
+@inproceedings{jimenez:reppar17,
+ abstract = {Independent validation of experimental results in the field of systems research is a challenging task, mainly due to differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce Popper, a convention based on a set of modern open source software (OSS) development principles for generating reproducible scientific publications. Concretely, we make the case for treating an article as an OSS project following a DevOps approach and applying software engineering best-practices to manage its associated artifacts and maintain the reproducibility of its findings. Popper leverages existing cloud-computing infrastructure and DevOps tools to produce academic articles that are easy to validate and extend. We present a use case that illustrates the usefulness of this approach. We show how, by following the Popper convention, reviewers and researchers can quickly get to the point of getting results without relying on the original author's intervention.
+},
+ address = {Orlando, FL},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau},
+ booktitle = {4th International Workshop on Reproducibility in Parallel Computing (REPPAR) in conjunction with IPDPS 2017},
+ date-added = {2017-07-31 03:27:58 +0000},
+ date-modified = {2020-01-04 21:41:54 -0700},
+ keywords = {papers, reproducibility, devops},
+ month = {June 2},
+ title = {The Popper Convention: Making Reproducible Systems Evaluation Practical},
+ year = {2017}}
+
+@inproceedings{watkins:hotstorage17,
+ abstract = {Popular storage systems support diverse storage abstractions by providing important disaggregation benefits. Instead of maintaining a separate system for each abstraction, unified storage systems, in particular, support standard file, block, and object abstractions so the same hardware can be used for a wider range and a more flexible mix of applications. As large-scale unified storage systems continue to evolve to meet the requirements of an increasingly diverse set of applications and next-generation hardware, de jure approaches of the past---based on standardized interfaces---are giving way to domain-specific interfaces and optimizations. While promising, the ad-hoc strategies characteristic of current approaches to co-design are untenable.
+The standardization of the POSIX I/O interface has been a major success. General adoption has allowed application developers to avoid vendor lock-in and encourages storage system designers to innovate independently. However, large-scale storage systems are generally dominated by proprietary offerings, preventing exploration of alternative interfaces when the need has presented itself. An increase in the number of special-purpose storage systems characterizes recent history in the field, including the emergence of high-performance, and highly modifiable, open-source storage systems, which enable system changes without fear of vendor lock-in. Unfortunately, evolving storage system interfaces is a challenging task requiring domain expertise, and is predicated on the willingness of programmers to forfeit the protection from change afforded by narrow interfaces.},
+ address = {Santa Clara, CA},
+ author = {Noah Watkins and Michael A. Sevilla and Ivo Jimenez and Kathryn Dahlgren and Peter Alvaro and Shel Finkelstein and Carlos Maltzahn},
+ booktitle = {HotStorage '17},
+ date-added = {2017-05-20 22:54:48 +0000},
+ date-modified = {2020-01-19 15:33:14 -0800},
+ keywords = {papers, storage, systems, declarative, distributed},
+ month = {July 10-11},
+ title = {DeclStore: Layering is for the Faint of Heart},
+ year = {2017}}
+
+@inproceedings{sevilla:eurosys17,
+ abstract = {Storage systems need to support high-performance for special-purpose data processing applications that run on an evolving storage device technology landscape. This puts tremendous pressure on storage systems to support rapid change both in terms of their interfaces and their performance. But adapting storage systems can be difficult because unprincipled changes might jeopardize years of code-hardening and performance optimization efforts that were necessary for users to entrust their data to the storage system. We introduce the programmable storage approach, which exposes internal services and abstractions of the storage stack as building blocks for higher-level services. We also build a prototype to explore how existing abstractions of common storage system services can be leveraged to adapt to the needs of new data processing systems and the increasing variety of storage devices. We illustrate the advantages and challenges of this approach by composing existing internal abstractions into two new higher-level services: a file system metadata load balancer and a high-performance distributed shared-log. The evaluation demonstrates that our services inherit desirable qualities of the back-end storage system, including the ability to balance load, efficiently propagate service metadata, recover from failure, and navigate trade-offs between latency and throughput using leases.},
+ address = {Belgrade, Serbia},
+ author = {Michael A. Sevilla and Noah Watkins and Ivo Jimenez and Peter Alvaro and Shel Finkelstein and Jeff LeFevre and Carlos Maltzahn},
+ booktitle = {EuroSys '17},
+ date-added = {2017-03-14 22:06:29 +0000},
+ date-modified = {2020-01-04 21:42:47 -0700},
+ keywords = {papers, storage, systems, programmable, abstraction},
+ month = {April 23-26},
+ title = {Malacology: A Programmable Storage System},
+ year = {2017}}
+
+@inproceedings{shewmaker:icccn16,
+ abstract = {No one likes waiting in traffic, whether on a road or on a computer network. Stuttering audio, slow interactive feedback, and untimely pauses in video annoy everyone and cost businesses sales and productivity. An ideal network should (1) minimize latency, (2) maximize bandwidth, (3) share resources according to a desired policy, (4) enable incremental deployment, and (5) minimize administrative overhead. Many technologies have been developed, but none yet satisfactorily address all five goals. The best performing solutions developed so far require controlled environments where coordinated modification of multiple components in the network is possible, but they suffer poor performance in more complex scenarios.
+We present TCP Inigo, which uses independent delay-based algorithms on the sender and receiver (i.e., ambidextrously) to satisfy all five goals. In networks with single administrative domains, like those in data centers, Inigo's fairness, bandwidth, and latency indices are up to 1.3x better than the best deployable solution. When deployed in a more complex environment, such as across administrative domains, Inigo's latency distribution tail is up to 42x better.},
+ address = {Waikoloa, HI},
+ author = {Andrew G. Shewmaker and Carlos Maltzahn and Katia Obraczka and Scott Brandt and John Bent},
+ booktitle = {25th International Conference on Computer Communications and Networks (ICCCN 2016)},
+ date-added = {2017-02-26 19:02:21 +0000},
+ date-modified = {2020-01-04 22:58:02 -0700},
+ keywords = {papers, networking, congestion, datacenter},
+ month = {August 1-4},
+ title = {TCP Inigo: Ambidextrous Congestion Control},
+ year = {2016}}
+
+@article{jimenez:login16,
+ abstract = {Independently validating experimental results in the field of computer systems research is a challenging task. Recreating an environment that resembles the one where an experiment was originally executed is a time-consuming endeavor. In this article, we present Popper, a convention (or protocol) for conducting experiments following a DevOps approach that allows researchers to make all associated artifacts publicly available with the goal of maximizing automation in the re-execution of an experiment and validation of its results.},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn and Jay Lofstead and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ date-added = {2017-01-17 23:58:32 +0000},
+ date-modified = {2020-01-04 21:44:35 -0700},
+ journal = {;login: The USENIX Magazine},
+ keywords = {papers, reproducibility, devops, versioning},
+ number = {4},
+ pages = {20--26},
+ title = {Standing on the Shoulders of Giants by Managing Scientific Experiments Like Software},
+ volume = {41},
+ year = {2016}}
+
+@article{klasky:jp16,
+ abstract = {As the exascale computing age emerges, data-related issues are becoming critical factors that determine how and where we do computing. Popular approaches used by traditional I/O solutions and storage libraries become increasingly bottlenecked due to their assumptions about data movement, re-organization, and storage. While new technologies, such as ``burst buffers'', can help address some of the short-term performance issues, it is essential that we reexamine the underlying storage and I/O infrastructure to effectively support requirements and challenges at exascale and beyond. In this paper we present a new approach to the exascale Storage System and I/O (SSIO), which is based on allowing users to inject application knowledge into the system and leverage this knowledge to better manage, store, and access large data volumes so as to minimize the time to scientific insights. Central to our approach is the distinction between the data, metadata, and the knowledge contained therein, transferred from the user to the system by describing ``utility'' of data as it ages.},
+ author = {Scott A. Klasky and Hasan Abbasi and Mark Ainsworth and J. Choi and Matthew Curry and T. Kurc and Qing Liu and Jay Lofstead and Carlos Maltzahn and Manish Parashar and Norbert Podhorszki and Eric Suchyta and Fang Wang and Matthew Wolf and C. S. Chang and M. Churchill and S. Ethier},
+ date-added = {2017-01-14 20:46:38 +0000},
+ date-modified = {2020-01-04 21:45:50 -0700},
+ journal = {J. Phys.: Conf. Ser.},
+ keywords = {papers, storage, exascale, systems, hpc},
+ month = {November 11},
+ number = {1},
+ pages = {012095},
+ title = {Exascale Storage Systems the SIRIUS Way},
+ volume = {759},
+ year = {2016}}
+
+@inproceedings{watkins:socc16-poster,
+ address = {Santa Clara, CA},
+ author = {Noah Watkins and Michael Sevilla and Ivo Jimenez and Neha Ojha and Peter Alvaro and Carlos Maltzahn},
+ booktitle = {SoCC'16},
+ date-added = {2016-12-21 23:16:32 +0000},
+ date-modified = {2020-01-04 21:46:57 -0700},
+ keywords = {shortpapers, declarative, storage, programmable},
+ month = {October 5-7},
+ title = {Brados: Declarative, Programmable Object Storage},
+ year = {2016}}
+
+@inproceedings{brummell:pmes16,
+ abstract = {To raise performance beyond Moore's law scaling, Approximate Computing reduces arithmetic quality to increase operations per second or per joule. It works on only a few applications. The quality-speed tradeoff seems inescapable; however, Unum Arithmetic simultaneously raises arithmetic quality yet reduces the number of bits required. Unums extend IEEE floats (type 1) or provide custom number systems to maximize information per bit (type 2). Unums achieve Approximate Computing cost savings without sacrificing answer quality.},
+ author = {Nic Brummell and John L. Gustafson and Andrew Klofas and Carlos Maltzahn and Andrew Shewmaker},
+ booktitle = {PMES 2016},
+ date-added = {2016-10-21 17:31:51 +0000},
+ date-modified = {2020-01-04 21:47:19 -0700},
+ keywords = {papers, math, computation},
+ month = {November 14},
+ title = {Unum Arithmetic: Better Math with Clearer Tradeoffs},
+ year = {2016}}
+
+@inproceedings{hacker:wrfws16,
+ address = {Boulder, CO},
+ author = {Josh Hacker and John Exby and David Gill and Ivo Jimenez and Carlos Maltzahn and Tim See and Gretchen Mullendore},
+ booktitle = {17th annual WRF Users Workshop},
+ date-added = {2016-10-19 08:18:01 +0000},
+ date-modified = {2016-10-19 08:22:45 +0000},
+ month = {June 27 - July 2},
+ title = {Collaborative WRF-Based Research and Education with Reproducible Numerical Weather Prediction Enabled by Software Containers},
+ year = {2016},
+ url = {http://www2.mmm.ucar.edu/wrf/users/workshops/WS2016/oral_presentations/4.3.pdf}}
+
+@inproceedings{hacker:ams16,
+ author = {Josh Hacker and John Exby and Nick Chartier and David Gill and Ivo Jimenez and Carlos Maltzahn and Gretchen Mullendore},
+ booktitle = {American Meteorological Society 32nd Conference on Environmental Processing Technologies},
+ date-added = {2016-10-19 08:14:20 +0000},
+ date-modified = {2019-12-26 16:07:15 -0800},
+ keywords = {papers, reproducibility, containers},
+ month = {January},
+ title = {Collaborative Research and Education with Numerical Weather Prediction Enabled by Software Containers},
+ year = {2016}}
+
+@inproceedings{watkins:pdsw15,
+ abstract = {Traditionally storage has not been part of a programming model's semantics and is added only as an I/O library interface. As a result, programming models, languages, and storage systems are limited in the optimizations they can perform for I/O operations, as the semantics of the I/O library is typically at the level of transfers of blocks of uninterpreted bits, with no accompanying knowledge of how those bits are used by the application. For many HPC applications where I/O operations for analyzing and checkpointing large data sets are a non-negligible portion of the overall execution time, such a ``know nothing'' I/O design has negative performance implications.
+We propose an alternative design where the I/O semantics are integrated as part of the programming model, and a common data model is used throughout the entire memory and storage hierarchy enabling storage and application level co-optimizations. We demonstrate these ideas through the integration of storage services within the Legion [2] runtime and present preliminary results demonstrating the integration.},
+ address = {Austin, TX},
+ author = {Noah Watkins and Zhihao Jia and Galen Shipman and Carlos Maltzahn and Alex Aiken and Pat McCormick},
+ booktitle = {PDSW'15},
+ date-added = {2016-08-31 06:03:13 +0000},
+ date-modified = {2020-01-04 21:48:24 -0700},
+ keywords = {papers, storage, systems, optimization, parallel, distributed, runtime},
+ month = {November 16},
+ title = {Automatic and transparent I/O optimization with storage integrated application runtime support},
+ year = {2015}}
+
+@unpublished{maltzahn:si2ws-poster16,
+ author = {Carlos Maltzahn and others},
+ date-added = {2016-08-18 06:04:41 +0000},
+ date-modified = {2020-01-04 21:49:20 -0700},
+ keywords = {shortpapers, overview, bigdata, reproducibility},
+ month = {February 16},
+ note = {Poster at SI2 Workshop},
+ title = {Big Weather Web: A common and sustainable big data infrastructure in support of weather prediction research and education in universities},
+ year = {2016}}
+
+@techreport{jimenez:ucsctr16,
+ abstract = {Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. Recreating an environment that resembles the original is difficult and time-consuming. In this paper we introduce the Popper Convention, a set of principles for producing scientific publications. Concretely, we make the case for treating an article as an open source software (OSS) project, applying software engineering best-practices to manage its associated artifacts and maintain the reproducibility of its findings. We leverage existing cloud-computing infrastructure and modern OSS development tools to produce academic articles that are easy to validate. We present our prototype file system, GassyFS, as a use case for illustrating the usefulness of this approach. We show how, by following Popper, re-executing experiments on multiple platforms is more practical, allowing reviewers and students to quickly get to the point of getting results without relying on the author's intervention.},
+ address = {Santa Cruz, CA},
+ author = {Ivo Jimenez and Michael Sevilla and Noah Watkins and Carlos Maltzahn},
+ date-added = {2016-08-18 05:58:51 +0000},
+ date-modified = {2020-01-04 21:49:52 -0700},
+ institution = {UC Santa Cruz},
+ keywords = {papers, reproducibility, systems, evaluation},
+ month = {May 19},
+ number = {UCSC-SOE-16-10},
+ title = {Popper: Making Reproducible Systems Performance Evaluation Practical},
+ type = {Tech. rept.},
+ year = {2016}}
+
+@inproceedings{jimenez:varsys16,
+ abstract = {Independent validation of experimental results in the field of parallel and distributed systems research is a challenging task, mainly due to changes and differences in software and hardware in computational environments. In particular, when an experiment runs on different hardware than the one where it originally executed, predicting the differences in results is difficult. In this paper, we introduce an architecture-independent method for characterizing the performance of a machine by obtaining a profile (a vector of microbenchmark results) that we use to quantify the variability between two hardware platforms. We propose the use of isolation features that OS-level virtualization offers to reduce the variability observed when validating application performance across multiple machines. Our results show that, using our variability characterization methodology, we can correctly predict the variability bounds of CPU-intensive applications, as well as reduce it by up to 2.8x if we make use of CPU bandwidth limitations, depending on the opcode mix of an application, as well as generational and architectural differences between two hardware platforms.},
+ address = {Chicago, IL},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jay Lofstead and Adam Moody and Kathryn Mohror and Remzi Arpaci-Dusseau and Andrea Arpaci-Dusseau},
+ booktitle = {VarSys'16},
+ date-added = {2016-05-19 13:24:07 +0000},
+ date-modified = {2020-01-04 21:50:21 -0700},
+ keywords = {papers, reproducibility},
+ month = {May 23},
+ title = {Characterizing and Reducing Cross-Platform Performance Variability Using OS-level Virtualization},
+ year = {2016}}
+
+@inproceedings{manzanares:hotstorage16,
+ address = {Denver, CO},
+ author = {Adam Manzanares and Noah Watkins and Cyril Guyot and Damien LeMoal and Carlos Maltzahn and Zvonimir Bandic},
+ booktitle = {HotStorage '16},
+ date-added = {2016-05-17 21:34:02 +0000},
+ date-modified = {2016-05-17 21:36:35 +0000},
+ keywords = {papers, storagemedium, shingledrecording, os, allocation},
+ month = {June 20-21},
+ title = {ZEA, A Data Management Approach for SMR},
+ year = {2016}}
+
+@inproceedings{sevilla:sc15,
+ abstract = {Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of them to move. We look at resource migration for file system metadata and show how CephFS's dynamic subtree partitioning approach can exploit varying degrees of locality and balance because it can partition the namespace into variable sized units. Unfortunately, the current metadata balancer is complicated and difficult to control because it struggles to address many of the general resource migration challenges inherent to the metadata management problem. To help decouple policy from mechanism, we introduce a programmable storage system that lets the designer inject custom balancing logic. We show the flexibility and transparency of this approach by replicating the strategy of a state-of-the-art metadata balancer and conclude by comparing this strategy to other custom balancers on the same system.},
+ address = {Austin, TX},
+ author = {Michael Sevilla and Noah Watkins and Carlos Maltzahn and Ike Nassi and Scott Brandt and Sage Weil and Greg Farnum and Sam Fineberg},
+ booktitle = {SC '15},
+ date-added = {2015-07-11 20:49:14 +0000},
+ date-modified = {2020-01-04 21:51:04 -0700},
+ keywords = {papers, metadata, management, loadbalancing, programmable, distributed, systems},
+ month = {November},
+ title = {Mantle: A Programmable Metadata Load Balancer for the Ceph File System},
+	year = {2015}}
+
+@techreport{watkins:ucsctr15,
+	abstract = {As applications scale to new levels and migrate into cloud environments, there has been a significant departure from the exclusive reliance on the POSIX file I/O interface. However, in doing so, applications often discover a lack of services, forcing them to use bolt-on features or take on the responsibility of critical data management tasks. This often results in duplication of complex software with extreme correctness requirements. Instead, wouldn't it be nice if an application could just convey what it wanted out of a storage system, and have the storage system understand?
+The central question we address in this paper is whether or not the design delta between two storage systems can be expressed in a form such that one system becomes little more than a configuration of the other. Storage systems should expose their useful services in a way that separates performance from correctness, allowing for their safe reuse. After all, hardened code in storage systems protects countless value, and its correctness is only as good as the stress we place on it. We demonstrate these concepts by synthesizing the CORFU high-performance shared-log abstraction in Ceph through minor modifications of existing sub-systems that are orthogonal to correctness.},
+ author = {Noah Watkins and Michael Sevilla and Carlos Maltzahn},
+ date-added = {2015-06-11 07:31:24 +0000},
+ date-modified = {2020-01-04 21:51:36 -0700},
+ institution = {UC Santa Cruz},
+ keywords = {papers, programmable, storage, systems},
+ month = {June 11},
+ number = {UCSC-SOE-15-12},
+ title = {The Case for Programmable Object Storage Systems},
+ type = {Tech. rept.},
+	year = {2015}}
+
+@inproceedings{skourtis:inflow14,
+ abstract = {We want to create a scalable flash storage system that provides read/write separation and uses erasure coding to provide reliability without the storage cost of replication. Flash on Rails [19] is a system for enabling consistent performance in flash storage by physically separating reads from writes through redundancy. In principle, Rails supports erasure codes. However, it has only been evaluated using replication in small arrays, so it is currently uncertain how it would scale with erasure coding.
+In this work, we consider the applicability of erasure coding in Rails, in a new system called eRails. We consider the effects of computation due to encoding/decoding on the raw performance, as well as its effect on performance consistency. We demonstrate that up to a certain number of drives, the performance remains unaffected while the computation cost remains modest. After that point, the computational cost grows quickly due to the coding itself, making further scaling inefficient. To support an arbitrary number of drives, we present a design allowing us to scale eRails by constructing overlapping erasure coding groups that preserve read/write separation. Finally, through benchmarks we demonstrate that eRails achieves read/write separation and consistent read performance under read/write workloads.
+},
+ address = {Broomfield, CO},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Noah Watkins and Carlos Maltzahn and Scott Brandt},
+ booktitle = {INFLOW '14 (at OSDI'14)},
+ date-added = {2014-12-06 21:50:01 +0000},
+ date-modified = {2020-01-04 21:52:42 -0700},
+ keywords = {papers, erasurecodes, performance, flash, garbagecollection, predictable},
+ month = {October 5},
+ title = {Erasure Coding \& Read/Write Separation in Flash Storage},
+	year = {2014}}
+
+@techreport{shewmaker:ucsctr14,
+	abstract = {The RUN (Reduction to UNiprocessor) [18, 19, 13] algorithm was first described by Regnier et al. as a novel and elegant solution to real-time multiprocessor scheduling. The first practical implementation of RUN [3], created by Compagnin et al., both verified the simulation results and showed that it can be efficiently implemented on top of standard operating system primitives. While RUN is now the proven best solution for scheduling fixed-rate tasks on multiprocessors, it can also be applied to other resources. This technical report briefly describes RUN and how it could be used in any situation involving an array of multiple resources where some form of preemptions and migrations are allowed (although these must be minimized). It also describes how buffers can be sanity checked in a system where a RUN-scheduled resource is consuming data from another RUN-scheduled resource.
+},
+ address = {Santa Cruz, CA},
+ author = {Andrew Shewmaker and Carlos Maltzahn and Katia Obraczka and Scott Brandt},
+ date-added = {2014-09-06 04:13:59 +0000},
+ date-modified = {2020-01-04 21:53:19 -0700},
+ institution = {University of California at Santa Cruz},
+ keywords = {papers, scheduling, networking, realtime, performance, management},
+ month = {July 23},
+ number = {UCSC-SOE-14-08},
+ title = {Run, Fatboy, Run: Applying the Reduction to Uniprocessor Algorithm to Other Wide Resources},
+ type = {Tech. rept.},
+	year = {2014}}
+
+@inproceedings{sevilla:lspp14,
+ abstract = {Reading input from primary storage (i.e. the ingest phase) and aggregating results (i.e. the merge phase) are important pre- and post-processing steps in large batch computations. Unfortunately, today's data sets are so large that the ingest and merge job phases are now performance bottlenecks. In this paper, we mitigate the ingest and merge bottlenecks by leveraging the scale-up MapReduce model. We introduce an ingest chunk pipeline and a merge optimization that increases CPU utilization (50 - 100\%) and job phase speedups (1.16x - 3.13x) for the ingest and merge phases. Our techniques are based on well-known algorithms and scale-out MapReduce optimizations, but applying them to a scale-up computation framework to mitigate the ingest and merge bottlenecks is novel.},
+ address = {Phoenix, AZ},
+ author = {Michael Sevilla and Ike Nassi and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn},
+ booktitle = {LSPP at IPDPS 2014},
+ date-added = {2014-07-11 20:56:28 +0000},
+ date-modified = {2020-01-04 21:54:00 -0700},
+ keywords = {papers, mapreduce, sharedmemory, performance},
+ month = {May 23},
+ title = {SupMR: Circumventing Disk and Memory Bandwidth Bottlenecks for Scale-up MapReduce},
+	year = {2014}}
+
+@inproceedings{sevilla:discs13,
+ abstract = {When data grows too large, we scale to larger systems, either by scaling out or up. It is understood that scale-out and scale-up have different complexities and bottlenecks but a thorough comparison of the two architectures is challenging because of the diversity of their programming interfaces, their significantly different system environments, and their sensitivity to workload specifics. In this paper, we propose a novel comparison framework based on MapReduce that accounts for the application, its requirements, and its input size by considering input, software, and hardware parameters. Part of this framework requires implementing scale-out properties on scale-up and we discuss the complex trade-offs, interactions, and dependencies of these properties for two specific case studies (word count and sort). This work lays the foundation for future work in quantifying design decisions and in building a system that automatically compares architectures and selects the best one.},
+ address = {Denver, CO},
+	author = {Michael Sevilla and Ike Nassi and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn},
+ booktitle = {DISCS 2013 at SC13},
+ date-added = {2014-07-11 20:53:58 +0000},
+ date-modified = {2020-01-04 21:55:43 -0700},
+ keywords = {papers, scalable, systems, distributed, sharedmemory},
+ month = {November 18},
+ title = {A Framework for an In-depth Comparison of Scale-up and Scale-out},
+	year = {2013}}
+
+@article{rose:sej91,
+ abstract = {In the context of the ESPRIT project DAIDA, we have developed an experimental environment intended to achieve consistency-in-the-large in a multi-person setting. Our conceptual model of configuration processes, the CAD$\,^{\circ}$ model, centres around decisions that work on configured objects and are subject to structured conversations. The environment, extending the knowledge-based software information system ConceptBase, supports co-operation within development teams by integrating models and tools for argumentation and co-ordination with those for versioning and configuration. Versioning decisions are discussed and decided on within an argument editor, and executed by specialised tools for programming-in-the-small. Tasks are assigned and monitored through a contract tool, and carried out within co-ordinated workspaces under a conflict-tolerant transaction protocol. Consistent configuration and reconfiguration of local results is supported by a logic-based configuration assistant.},
+ author = {Thomas Rose and Matthias Jarke and Martin Gocek and Carlos Maltzahn and Hans Nissen},
+ date-added = {2014-06-27 02:43:48 +0000},
+ date-modified = {2020-01-04 21:56:48 -0700},
+ journal = {Software Engineering Journal},
+ keywords = {papers, software, programming, collaborative},
+ month = {September},
+ number = {5},
+ pages = {332--346},
+ title = {A Decision-Based Configuration Process Environment},
+ volume = {6},
+ year = {1991},
+ bdsk-url-1 = {http://dx.doi.org/10.1049/sej.1991.0034}}
+
+@misc{hacker:ncar14,
+ author = {Joshua Hacker and Carlos Maltzahn and Gretchen Mullendore and Russ Schumacher},
+ date-added = {2014-06-21 21:53:41 +0000},
+ date-modified = {2014-06-24 17:21:49 +0000},
+ howpublished = {Web page. www.rap.ucar.edu/staff/hacker/BigWeather.pdf},
+ keywords = {papers, nwp, geoscience, simulation, infrastructure},
+ month = {January},
+ title = {Big Weather - A workshop on overcoming barriers to distributed production, storage, and analysis of multi-model ensemble forecasts in support of weather prediction research and education in universities},
+	year = {2014}}
+
+@inproceedings{skourtis:atc14,
+ abstract = {Modern applications and virtualization require fast and predictable storage. Hard-drives have low and unpredictable performance, while keeping everything in DRAM is still prohibitively expensive or unnecessary in many cases. Solid-state drives offer a balance between performance and cost and are becoming increasingly popular in storage systems, playing the role of large caches and permanent storage. Although their read performance is high and predictable, SSDs frequently block in the presence of writes, exceeding hard-drive latency and leading to unpredictable performance.
+Many systems with mixed workloads have low latency requirements or require predictable performance and guarantees. In such cases the performance variance of SSDs becomes a problem for both predictability and raw performance. In this paper, we propose Rails, a design based on redundancy, which provides predictable performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as well as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of Rails and read/write separation in solid-state drives.},
+ address = {Philadelphia, PA},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Noah Watkins and Carlos Maltzahn and Scott Brandt},
+ booktitle = {USENIX ATC '14},
+ date-added = {2014-05-10 00:06:33 +0000},
+ date-modified = {2020-01-04 21:58:01 -0700},
+ keywords = {papers, flash, performance, redundancy, qos},
+ month = {June 19-20},
+ title = {Flash on Rails: Consistent Flash Performance through Redundancy},
+	year = {2014}}
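+
+Editor's note: the sketch below sits between entries, which BibTeX treats as a
+comment. It is a minimal illustration of the redundancy-based read/write
+separation summarized in the abstract above: two mirrored drives alternate
+between a read-only role and a write role, and buffered writes are replayed
+when the roles flip. All names are hypothetical; this is a sketch of the idea,
+not Rails' actual design or API.
+
+  # Minimal sketch (Python): reads only ever touch the read-only replica, so
+  # they never queue behind writes; the writer replica absorbs all writes, and
+  # a periodic role switch replays buffered writes to the former reader.
+  class MirroredPair:
+      def __init__(self):
+          self.stores = [dict(), dict()]  # two replicas of the same data
+          self.reader = 0                 # index of the current read-only replica
+          self.pending = []               # writes not yet applied to the reader
+
+      def read(self, key):
+          return self.stores[self.reader].get(key)
+
+      def write(self, key, value):
+          self.stores[1 - self.reader][key] = value  # writer absorbs all writes
+          self.pending.append((key, value))
+
+      def switch_roles(self):
+          for key, value in self.pending:  # catch the old reader up
+              self.stores[self.reader][key] = value
+          self.pending.clear()
+          self.reader = 1 - self.reader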
+
+@inproceedings{crume:msst14,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such a fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains.},
+ address = {Santa Clara, CA},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry},
+ booktitle = {MSST '14},
+ date-added = {2014-05-10 00:02:27 +0000},
+ date-modified = {2020-01-04 21:58:30 -0700},
+ keywords = {papers, machinelearning, modeling, simulation, storagemedium, autotuning},
+ month = {June 2-6},
+ title = {Automatic Generation of Behavioral Hard Disk Drive Access Time Models},
+	year = {2014}}
+
+@inproceedings{maltzahn:gamifir14,
+ abstract = {The super-exponential growth of digital data world-wide is matched by personal digital archives containing songs, ebooks, audio books, photos, movies, textual documents, and documents of other media types. For many types of media it is usually a lot easier to add items than to keep archives from falling into disarray and incurring data loss. The overhead of maintaining these personal archives frequently surpasses the time and patience their owners are willing to dedicate to this important task. The promise of gamification in this context is to significantly extend the willingness to maintain personal archives by enhancing the experience of personal archive management.
+In this paper we focus on a subcategory of personal archives which we call private archives. These are archives that, for a variety of reasons, the owner does not want to make available online, which consequently limits archive maintenance to an individual activity and rules out any form of crowdsourcing out of fear of unwanted information leaks. As an example of private digital archive maintenance gamification, we describe InfoGarden, a casual game that turns document tagging into an individual activity of (metaphorically) weeding a garden and protecting plants from gophers, and includes a reward system that encourages orthogonal tag usage. The paper concludes with lessons learned and summarizes remaining challenges.},
+ address = {Amsterdam, Netherlands},
+ author = {Carlos Maltzahn and Arnav Jhala and Michael Mateas and Jim Whitehead},
+ booktitle = {GamifIR'14 at ECIR'14},
+ date-added = {2014-04-22 01:27:12 +0000},
+ date-modified = {2020-01-04 21:59:05 -0700},
+ keywords = {papers, gamification, games, archive, digitalpreservation, tagging},
+ month = {April 13},
+ title = {Gamification of Private Digital Data Archive Management},
+	year = {2014}}
+
+@techreport{crume:ucsctr14,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. While previous research has created black-box models of hard disk drive performance, none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We identify these high frequencies with Fourier analysis and include them explicitly as input to the model. In this paper we focus on the simulation of access times for random read workloads within a single zone. We are able to automatically generate and tune request-level access time models with mean absolute error less than 0.15 ms. To our knowledge this is the first time such a fidelity has been achieved with modern disk drives using machine learning. We are confident that our approach forms the core for automatic generation of access time models that include other workloads and span across entire disk drives, but more work remains.},
+ address = {Santa Cruz, CA},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry},
+ date-added = {2014-03-28 22:23:23 +0000},
+ date-modified = {2020-01-04 21:59:33 -0700},
+ institution = {University of California at Santa Cruz},
+ keywords = {papers, machinelearning, storagemedium, simulation, modeling, autotuning, neuralnetworks},
+ month = {March 28},
+ number = {UCSC-SOE-14-02},
+ title = {Automatic Generation of Behavioral Hard Disk Drive Access Time Models},
+ type = {Technical Report},
+	year = {2014}}
+
+@inproceedings{jimenez:pdsw13poster,
+ address = {Denver, CO},
+ author = {Ivo Jimenez and Carlos Maltzahn and Jai Dayal and Jay Lofstead},
+ booktitle = {Poster Session at PDSW 2013 at SC13},
+ date-added = {2013-12-08 21:27:53 +0000},
+ date-modified = {2020-01-04 21:59:53 -0700},
+ keywords = {shortpapers, transactions, hpc, exascale, parallel, datamanagement},
+ month = {November 17},
+ title = {Exploring Trade-offs in Transactional Parallel Data Movement},
+	year = {2013}}
+
+@inproceedings{crume:pdsw13,
+ abstract = {Predicting access times is a crucial part of predicting hard disk drive performance. Existing approaches use white-box modeling and require intimate knowledge of the internal layout of the drive, which can take months to extract. Automatically learning this behavior is a much more desirable approach, requiring less expert knowledge, fewer assumptions, and less time. Others have created behavioral models of hard disk drive performance, but none have shown low per-request errors. A barrier to machine learning of access times has been the existence of periodic behavior with high, unknown frequencies. We show how hard disk drive access times can be predicted to within 0.83 ms using a neural net after these frequencies are found using Fourier analysis.
+},
+ address = {Denver, CO},
+ author = {Adam Crume and Carlos Maltzahn and Lee Ward and Thomas Kroeger and Matthew Curry and Ron Oldfield},
+ booktitle = {PDSW'13},
+ date-added = {2013-11-30 19:31:15 +0000},
+ date-modified = {2020-01-04 22:00:13 -0700},
+ keywords = {papers, machinelearning, performance, modeling, storagemedium, neuralnetworks},
+ month = {November 18},
+ title = {Fourier-Assisted Machine Learning of Hard Disk Drive Access Time Models},
+	year = {2013}}
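+
+Editor's note: text between entries is ignored by BibTeX, so the following is
+a minimal sketch of the Fourier-assisted idea summarized in the abstract
+above: recover the dominant periodic frequencies in observed access times via
+an FFT, then hand them to a regression model as explicit sin/cos features.
+The synthetic trace and all names are illustrative assumptions, not the
+authors' code or data.
+
+  import numpy as np
+
+  # Synthetic access-time trace with one hidden periodic component.
+  rng = np.random.default_rng(0)
+  lba = np.arange(4096)  # logical block addresses, evenly sampled
+  times = 4.2 + 0.8 * np.sin(2 * np.pi * lba / 64) + 0.05 * rng.standard_normal(lba.size)
+
+  # Fourier analysis: locate the strongest frequencies in the trace.
+  spectrum = np.abs(np.fft.rfft(times - times.mean()))
+  freqs = np.fft.rfftfreq(times.size, d=1.0)
+  top = freqs[np.argsort(spectrum)[-3:]]  # a few dominant frequencies
+
+  # Feature matrix for a downstream model (e.g., a small neural net): raw
+  # position plus explicit periodic terms at the discovered frequencies, so
+  # the model need not learn the high, unknown frequencies on its own.
+  features = np.column_stack(
+      [lba] + [f(2 * np.pi * fr * lba) for fr in top for f in (np.sin, np.cos)])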
+
+@inproceedings{skourtis:inflow13,
+	abstract = {Solid-state drives are becoming increasingly popular in enterprise storage systems, playing the role of large caches and permanent storage. Although SSDs provide faster random access than hard-drives, their performance under read/write workloads is highly variable, often exceeding that of hard-drives (e.g., taking 100ms for a single read). Many systems with mixed workloads have low latency requirements, or require predictable performance and guarantees. In such cases, the performance variance of SSDs becomes a problem for both predictability and raw performance.
+In this paper, we propose a design based on redundancy, which provides high performance and low latency for reads under read/write workloads by physically separating reads from writes. More specifically, reads achieve read-only performance while writes perform at least as well as before. We evaluate our design using micro-benchmarks and real traces, illustrating the performance benefits of read/write separation in solid-state drives.},
+ address = {Farmington, PA},
+ author = {Dimitris Skourtis and Dimitris Achlioptas and Carlos Maltzahn and Scott Brandt},
+ booktitle = {INFLOW '13},
+ date-added = {2013-09-11 06:19:23 +0000},
+ date-modified = {2020-01-04 22:04:04 -0700},
+ keywords = {papers, flash, erasurecodes, redundancy, storage, distributed, systems},
+ month = {November 3},
+ title = {High Performance \& Low Latency in Solid-State Drives Through Redundancy},
+	year = {2013}}
+
+@inproceedings{watkins:bdmc13,
+ abstract = {The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the practice of virtually always designing for file-like byte-stream interfaces, co-designed domain-specific storage system interfaces are becoming increasingly common. However, in order for developers to evolve interfaces in high-availability storage systems, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system. Current clustered storage systems that provide interface customizability expose primitive services for managing ad-hoc interfaces. For maximum utility, the ability to create, evolve, and deploy dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and client applications. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system.},
+ address = {Aachen, Germany},
+ author = {Noah Watkins and Carlos Maltzahn and Scott Brandt and Ian Pye and Adam Manzanares},
+ booktitle = {BigDataCloud '13 (in conjunction with EuroPar 2013)},
+ date-added = {2013-07-21 00:37:45 +0000},
+ date-modified = {2020-05-10 19:29:48 -0700},
+ keywords = {papers, datamodel, scripting, storage, systems, software-defined, programmable},
+ month = {August 26},
+ title = {In-Vivo Storage System Development},
+	year = {2013}}
+
+@inproceedings{buck:sc13,
+	abstract = {The MapReduce framework is being extended for domains quite different from the web applications for which it was designed, including the processing of big structured data, e.g., scientific and financial data. Previous work using MapReduce to process scientific data ignores existing structure when assigning intermediate data and scheduling tasks. In this paper, we present a method for incorporating knowledge of the structure of scientific data and of the executing query into the MapReduce communication model. Built in SciHadoop, a version of the Hadoop MapReduce framework for scientific data, SIDR intelligently partitions and routes intermediate data, allowing it to: remove Hadoop's global barrier and execute Reduce tasks prior to all Map tasks completing; minimize intermediate key skew; and produce early, correct results. SIDR executes queries up to 2.5 times faster than Hadoop and 37\% faster than SciHadoop; produces initial results with only 6\% of the query completed; and produces dense, contiguous output.},
+ address = {Denver, CO},
+ author = {Joe Buck and Noah Watkins and Greg Levin and Adam Crume and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn and Neoklis Polyzotis and Aaron Torres},
+ booktitle = {SC '13},
+ date-added = {2013-07-21 00:28:59 +0000},
+ date-modified = {2020-01-04 23:20:15 -0700},
+ keywords = {papers, mapreduce, structured, datamanagement, routing, hpc},
+ month = {November},
+ title = {SIDR: Structure-Aware Intelligent Data Routing in Hadoop},
+	year = {2013}}
+
+@techreport{skourtis:ucsctr13a,
+ address = {Santa Cruz, CA},
+ author = {Dimitris Skourtis and Noah Watkins and Dimitris Achlioptas and Carlos Maltzahn and Scott Brandt},
+ date-added = {2013-07-18 18:38:34 +0000},
+ date-modified = {2013-07-19 05:52:07 +0000},
+ institution = {UCSC},
+ keywords = {papers, flash, cluster, redundancy, performance, management, qos, parallel, filesystems},
+ month = {July 18},
+ number = {UCSC-SOE-13-10},
+ title = {Latency Minimization in {SSD} Clusters for Free},
+ type = {Tech. rept.},
+	year = {2013}}
+
+@techreport{skourtis:ucsctr13,
+ address = {Santa Cruz, CA},
+ author = {Dimitris Skourtis and Scott A. Brandt and Carlos Maltzahn},
+ date-added = {2013-07-17 23:54:42 +0000},
+ date-modified = {2013-07-17 23:58:48 +0000},
+ institution = {UCSC},
+ keywords = {papers, flash, performance, management, qos},
+ month = {May 14},
+ number = {UCSC-SOE-13-08},
+ title = {Ianus: Guaranteeing High Performance in Solid-State Drives},
+ type = {Tech. rept.},
+ year = {2013},
+ bdsk-url-1 = {http://www.soe.ucsc.edu/research/technical-reports/ucsc-soe-13-08}}
+
+@techreport{buck:ucsctr12,
+ address = {Santa Cruz, CA},
+ author = {Joe Buck and Noah Watkins and Greg Levin and Adam Crume and Kleoni Ioannidou and Scott Brandt and Carlos Maltzahn and Neoklis Polyzotis},
+ date-added = {2013-05-30 22:56:59 +0000},
+ date-modified = {2013-05-30 22:59:07 +0000},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, mapreduce, hadoop, hpc, communication, networking, structured, datamanagement},
+ month = {July 26},
+ number = {UCSC-SOE-12-08},
+ title = {Structure-Aware Intelligent Data Routing in SciHadoop},
+ type = {Technical Report},
+	year = {2012}}
+
+@techreport{crume:ucsctr12,
+ address = {Santa Cruz, CA},
+ author = {Adam Crume and Joe Buck and Noah Watkins and Carlos Maltzahn and Scott Brandt and Neoklis Polyzotis},
+ date-added = {2013-05-30 22:54:07 +0000},
+ date-modified = {2013-05-30 22:55:49 +0000},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, compression, hadoop, semantic, structured, datamanagement, mapreduce},
+ month = {August 16},
+ number = {UCSC-SOE-12-13},
+ title = {SciHadoop Semantic Compression},
+ type = {Technical Report},
+	year = {2012}}
+
+@techreport{watkins:ucsctr13,
+ abstract = {The emergence of high-performance open-source storage systems is allowing application and middleware developers to consider non-standard storage system interfaces. In contrast to the common practice of translating all I/O access onto the POSIX file interface, it will soon be common for application development to include the co-design of storage system interfaces. In order for developers to evolve a co-design in high-availability clusters, services are needed for in-vivo interface evolution that allows the development of interfaces in the context of a live system.
+Current clustered storage systems that provide interface customizability expose primitive services for managing static interfaces. For maximum utility, creating, evolving, and deploying dynamic storage interfaces is needed. However, in large-scale clusters, dynamic interface instantiation will require system-level support that ensures interface version consistency among storage nodes and clients. We propose that storage systems should provide services that fully manage the life-cycle of dynamic interfaces that are aligned with the common branch-and-merge form of software maintenance, including isolated development workspaces that can be combined into existing production views of the system.
+},
+ address = {Santa Cruz, CA},
+ author = {Noah Watkins and Carlos Maltzahn and Scott Brandt and Ian Pye and Adam Manzanares},
+ date-added = {2013-05-30 22:41:44 +0000},
+ date-modified = {2020-01-04 23:01:28 -0700},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, datamodel, scripting, storage, systems, software-defined},
+ month = {March 16},
+ number = {UCSC-SOE-13-02},
+ title = {In-Vivo Storage System Development},
+ type = {Technical Report},
+	year = {2013}}
+
+@inproceedings{ionkov:msst13,
+	abstract = {Until recently, most scientific applications produced data that was saved, analyzed, and visualized at a later time. In recent years, with the large increase in the amount of data and computational power available, there is demand for applications to support data access in-situ, or close to the simulation, to provide application steering, analytics, and visualization. Data access patterns required for these activities are usually different from the data layout produced by the application. In most large HPC clusters, scientific data is stored in parallel file systems instead of locally on the cluster nodes. To increase reliability, the data is replicated, usually using some of the standard RAID schemes. Parallel file server nodes usually have more processing power than they need, so it is feasible to offload some of the data-intensive processing to them. The DRepl project replaces the standard methods of data replication with replicas having different layouts, optimized for the most commonly used access patterns. A replica can be complete (i.e., any other replica can be reconstructed from it) or incomplete. DRepl consists of a language to describe the dataset and the necessary data layouts, and tools to create a user-space file server that provides and keeps the data consistent and up to date in all optimized layouts.},
+ address = {Long Beach, CA},
+ author = {Latchesar Ionkov and Mike Lang and Carlos Maltzahn},
+ booktitle = {MSST '13},
+ date-added = {2013-03-26 23:29:57 +0000},
+ date-modified = {2020-05-10 19:28:44 -0700},
+ keywords = {papers, redundancy, layout, hpc, storage, storagemedium, languages, programmable},
+ month = {May 6-10},
+ title = {DRepl: Optimizing Access to Application Data for Analysis and Visualization},
+	year = {2013}}
+
+@inproceedings{he:hpdc13,
+ abstract = {The I/O bottleneck in high-performance computing is becoming worse as application data continues to grow. In this work, we explore how patterns of I/O within these applications can significantly affect the effectiveness of the underlying storage systems and how these same patterns can be utilized to improve many aspects of the I/O stack and mitigate the I/O bottleneck. We offer three main contributions in this paper. First, we develop and evaluate algorithms by which I/O patterns can be efficiently discovered and described. Second, we implement one such algorithm to reduce the metadata quantity in a virtual parallel file system by up to several orders of magnitude, thereby increasing the performance of writes and reads by up to 40 and 480 percent respectively. Third, we build a prototype file system with pattern-aware prefetching and evaluate it to show a 46 percent reduction in I/O latency. Finally, we believe that efficient pattern discovery and description, coupled with the observed predictability of complex patterns within many high-performance applications, offers significant potential to enable many additional I/O optimizations.},
+ address = {New York City, NY},
+ author = {Jun He and John Bent and Aaron Torres and Gary Grider and Garth Gibson and Carlos Maltzahn and Xian-He Sun},
+ booktitle = {HPDC '13},
+ date-added = {2013-03-26 23:25:38 +0000},
+ date-modified = {2020-01-05 05:25:00 -0700},
+ keywords = {papers, compression, plfs, indexing, checkpointing, patterndetection},
+ month = {June 17-22},
+ title = {I/O Acceleration with Pattern Detection},
+	year = {2013}}
+
+@techreport{maltzahn:cutr99,
+ abstract = {The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates.
+We calculate the potential reduction in bandwidth for a given bandwidth usage profile, and show that a simple heuristic has poor prefetch accuracy. We then apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites.},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald and James Martin},
+ date-added = {2012-12-07 22:58:31 +0000},
+ date-modified = {2020-01-05 05:26:05 -0700},
+ institution = {Dept. of Computer Science, University of Colorado at Boulder},
+ keywords = {papers, prefetching, caching, machinelearning, networking, intermediary},
+ month = {January},
+ number = {CU-CS-879-99},
+ title = {A Feasibility Study of Bandwidth Smoothing on the World-Wide Web Using Machine Learning},
+ type = {Technical Report},
+	year = {1999}}
+
+@phdthesis{maltzahn:phdthesis99,
+ abstract = {The resource utilization of enterprise-level Web proxy servers is primarily dependent on network and disk I/O latencies and is highly variable due to a diurnal workload pattern with very predictable peak and off-peak periods. Often, the cost of resources depends on the purchased resource capacity instead of the actual utilization. This motivates the use of off-peak periods to perform speculative work in the hope that this work will later reduce resource utilization during peak periods. We take two approaches to improve resource utilization.
+In the first approach we reduce disk I/O by cache compaction during off-peak periods and by carefully designing the way a cache architecture utilizes operating system services such as the file system buffer cache and the virtual memory system. Evaluating our designs with workload generators on standard file systems we achieve disk I/O savings of over 70% compared to existing Web proxy server architectures.
+In the second approach we reduce peak bandwidth levels by prefetching bandwidth during off-peak periods. Our analysis reveals that 40% of the cacheable miss bandwidth is prefetchable. We found that 99% of this prefetchable bandwidth is based on objects that the Web proxy server under study has not accessed before. However, these objects originate from servers which the Web proxy server under study has accessed before. Using machine learning techniques we are able to automatically generate prefetch strategies of high accuracy and medium coverage. A test of these prefetch strategies on real workloads achieves a peak-level reduction of up to 12%.},
+ address = {Boulder, Co},
+ author = {Carlos Maltzahn},
+ date-added = {2012-12-07 22:22:06 +0000},
+ date-modified = {2020-01-05 05:26:52 -0700},
+ keywords = {papers, prefetching, networking, intermediary, caching, performance, machinelearning},
+ school = {University of Colorado at Boulder},
+ title = {Improving Resource Utilization of Enterprise-Level World-Wide Web Proxy Servers},
+	year = {1999}}
+
+@inproceedings{watkins:pdsw12,
+	abstract = {As applications become more complex, and the level of concurrency in systems continues to rise, developers are struggling to scale complex data models on top of a traditional byte stream interface. Middleware tailored for specific data models is a common approach to dealing with these challenges, but middleware commonly reproduces scalable services already present in many distributed file systems.
+We present DataMods, an abstraction over existing services found in large-scale storage systems that allows middleware to take advantage of existing, highly tuned services. Specifically, DataMods provides an abstraction for extending storage system services in order to implement native, domain-specific data models and interfaces throughout the storage hierarchy.},
+ address = {Salt Lake City, UT},
+ author = {Noah Watkins and Carlos Maltzahn and Scott A. Brandt and Adam Manzanares},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:03:40 +0000},
+ date-modified = {2020-01-05 05:27:34 -0700},
+ keywords = {papers, filesystems, programming, datamanagement},
+ month = {November 12},
+ title = {DataMods: Programmable File System Services},
+	year = {2012}}
+
+@inproceedings{crume:pdsw12,
+	abstract = {In Hadoop, mappers send data to reducers in the form of key/value pairs. The default design of Hadoop's process for transmitting this intermediate data can cause a very high overhead, especially for scientific data containing multiple variables in a multi-dimensional space. For example, for a 3D scalar field of a variable ``windspeed1'', the size of keys was 6.75 times the size of values. Much of the disk and network bandwidth of ``shuffling'' this intermediate data is consumed by repeatedly transmitting the variable name for each value. This significant waste of resources is due to an assumption fundamental to Hadoop's design that all key/value pairs are independent. This assumption is inadequate for scientific data which is often organized in regular grids, a structure that can be described in small, constant size.
+Earlier we presented SciHadoop, a slightly modified version of Hadoop designed for processing scientific data. We reported on experiments with SciHadoop which confirm that the size of intermediate data has a significant impact on overall performance. Here we show preliminary designs of multiple lossless approaches to compressing intermediate data, one of which results in up to a five-orders-of-magnitude reduction in the original key/value ratio.},
+ address = {Salt Lake City, UT},
+ author = {Adam Crume and Joe Buck and Carlos Maltzahn and Scott Brandt},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:02:29 +0000},
+ date-modified = {2020-01-05 06:29:22 -0700},
+ keywords = {papers, mapreduce, compression, array},
+ month = {November 12},
+ title = {Compressing Intermediate Keys between Mappers and Reducers in SciHadoop},
+ year = {2012}}
+
+@inproceedings{he:pdsw12,
+ abstract = {Checkpointing is the predominant storage driver in today's petascale supercomputers and is expected to remain as such in tomorrow's exascale supercomputers. Users typically prefer to checkpoint into a shared file, yet parallel file systems often perform poorly for shared file writing. A powerful technique to address this problem is to transparently transform shared file writing into many exclusively written files, as is done in ADIOS and PLFS. Unfortunately, the metadata to reconstruct the fragments into the original file grows with the number of writers. As such, the current approach cannot scale to exaflop supercomputers due to the large overhead of creating and reassembling the metadata.
+In this paper, we develop and evaluate algorithms by which patterns in the PLFS metadata can be discovered and then used to replace the current metadata. Our evaluation shows that these patterns reduce the size of the metadata by several orders of magnitude, increase the performance of writes by up to 40 percent, and increase the performance of reads by up to 480 percent. This contribution can therefore allow current checkpointing models to survive the transition from peta- to exascale.},
+ address = {Salt Lake City, UT},
+ author = {Jun He and John Bent and Aaron Torres and Gary Grider and Garth Gibson and Carlos Maltzahn and Xian-He Sun},
+ booktitle = {PDSW'12},
+ date-added = {2012-11-02 06:00:38 +0000},
+ date-modified = {2020-01-05 05:28:43 -0700},
+ keywords = {papers, compression, indexing, plfs, patterndetection, checkpointing},
+ month = {November 12},
+ read = {1},
+ title = {Discovering Structure in Unstructured I/O},
+ year = {2012}}
+
+@techreport{watkins:soetr12,
+ abstract = {Cloud-based services have become an attractive alternative to in-house data centers because of their flexible, on-demand availability of compute and storage resources. This is also true for scientific high-performance computing (HPC) applications that are currently being run on expensive, dedicated hardware. One important challenge of HPC applications is their need to perform periodic global checkpoints of execution state to stable storage in order to recover from failures, but the checkpoint process can dominate the total run-time of HPC applications even in the failure-free case! In HPC architectures, dedicated stable storage is highly tuned for this type of workload using locality and physical layout policies, which are generally unknown in typical cloud environments. In this paper we introduce DataMods, an extended version of the Ceph file system and its associated distributed object store RADOS, which are widely used in open source cloud stacks. DataMods extends object-based storage with extended services that take advantage of common cloud data center node hardware configurations (i.e., CPU and local storage resources) and that can be used to construct efficient, scalable middleware services that span the entire storage stack and utilize asynchronous services for offline data management.},
+ address = {Santa Cruz, CA},
+ author = {Noah Watkins and Carlos Maltzahn and Scott A. Brandt and Adam Manzanares},
+ date-added = {2012-07-21 11:39:45 +0000},
+ date-modified = {2020-01-05 05:29:20 -0700},
+ institution = {University of California Santa Cruz},
+ keywords = {papers, filesystems, programming, datamanagement},
+ month = {July},
+ number = {UCSC-SOE-12-07},
+ title = {DataMods: Programmable File System Services},
+ type = {Technical Report},
+ year = {2012}}
+
+@inproceedings{bhagwan:spe12,
+ abstract = {In healthcare, de-identification is fast becoming an indispensable service when medical data needs to be used for research and secondary use purposes. Currently, this process is done either manually, by a human agent, or by an automated software algorithm. Both approaches have shortcomings. Here, we introduce a framework for enhancing the outcome of the current modes of executing a de-identification service. This paper presents the steps taken in conceiving and building a privacy framework and tool that improves the service of de-identification. Further, we test the usefulness and applicability of this system through a study with HIPAA-trained experts.},
+ address = {Honolulu, HI},
+ author = {Varun Bhagwan and Tyrone Grandison and Carlos Maltzahn},
+ booktitle = {IEEE 2012 Services Workshop on Security and Privacy Engineering (SPE2012)},
+ date-added = {2012-05-22 03:42:44 +0000},
+ date-modified = {2020-01-05 05:29:59 -0700},
+ keywords = {papers, privacy, humancomputation, healthcare},
+ month = {June},
+ title = {Recommendation-based De-Identification: A Practical Systems Approach towards De-identification of Unstructured Text in Healthcare},
+ year = {2012}}
+
+@inproceedings{kato:usenix12,
+ abstract = {Graphics processing units (GPUs) have become a very powerful platform embracing a concept of heterogeneous many-core computing. However, application domains of GPUs are currently limited to specific systems, largely due to a lack of ``first-class'' GPU resource management for general-purpose multi-tasking systems.
+We present Gdev, a new ecosystem of GPU resource management in the operating system (OS). It allows the user space as well as the OS itself to use GPUs as first-class computing resources. Specifically, Gdev's virtual memory manager supports data swapping for excessive memory resource demands, and also provides a shared device memory functionality that allows GPU contexts to communicate with other contexts. Gdev further provides a GPU scheduling scheme to virtualize a physical GPU into multiple logical GPUs, enhancing isolation among working sets of multi-tasking systems.
+Our evaluation conducted on Linux and the NVIDIA GPU shows that the basic performance of our prototype implementation is reliable even compared to proprietary software. Further detailed experiments demonstrate that Gdev achieves a 2x speedup for an encrypted file system using the GPU in the OS. Gdev can also improve the makespan of dataflow programs by up to 49% by exploiting shared device memory, while the error in the utilization of virtualized GPUs can be limited to within only 7%.},
+ address = {Boston, MA},
+ author = {Shinpei Kato and Michael McThrow and Carlos Maltzahn and Scott A. Brandt},
+ booktitle = {USENIX ATC '12},
+ date-added = {2012-04-06 22:55:09 +0000},
+ date-modified = {2020-01-05 05:30:40 -0700},
+ keywords = {papers, gpgpu, kernel, linux, scheduling},
+ title = {Gdev: First-Class GPU Resource Management in the Operating System},
+ year = {2012}}
+
+@inproceedings{liu:msst12,
+ abstract = {The largest-scale high-performance computing (HPC) systems are stretching parallel file systems to their limits in terms of aggregate bandwidth and numbers of clients. To further sustain the scalability of these file systems, researchers and HPC storage architects are exploring various storage system designs. One proposed storage system design integrates a tier of solid-state burst buffers into the storage system to absorb application I/O requests. In this paper, we simulate and explore this storage system design for use by large-scale HPC systems. First, we examine application I/O patterns on an existing large-scale HPC system to identify common burst patterns. Next, we describe enhancements to the CODES storage system simulator to enable our burst buffer simulations. These enhancements include the integration of a burst buffer model into the I/O forwarding layer of the simulator, the development of an I/O kernel description language and interpreter, the development of a suite of I/O kernels that are derived from observed I/O patterns, and fidelity improvements to the CODES models. We evaluate the I/O performance for a set of multi-application I/O workloads and burst buffer configurations. We show that burst buffers can accelerate the application-perceived throughput to the external storage system and can reduce the amount of external storage bandwidth required to meet a desired application-perceived throughput goal.},
+ address = {Pacific Grove, CA},
+ author = {Ning Liu and Jason Cope and Philip Carns and Christopher Carothers and Robert Ross and Gary Grider and Adam Crume and Carlos Maltzahn},
+ booktitle = {MSST/SNAPI 2012},
+ date-added = {2012-03-14 14:37:23 +0000},
+ date-modified = {2020-01-05 05:31:12 -0700},
+ keywords = {papers, burstbuffer, simulation, hpc, distributed},
+ month = {April 16 - 20},
+ title = {On the Role of Burst Buffers in Leadership-class Storage Systems},
+ year = {2012},
+ bdsk-url-1 = {http://www.mcs.anl.gov/uploads/cels/papers/P2070-0312.pdf}}
+
+@inproceedings{crume:sc11poster,
+ address = {Seattle, WA},
+ author = {Adam Crume and Carlos Maltzahn and Jason Cope and Sam Lang and Rob Ross and Phil Carns and Chris Carothers and Ning Liu and Curtis L. Janssen and John Bent and Stephen Eidenbenz and Meghan Wingate},
+ booktitle = {Poster Session at SC 11},
+ date-added = {2012-03-01 20:39:54 +0000},
+ date-modified = {2020-01-05 05:31:34 -0700},
+ keywords = {shortpapers, machinelearning, simulation, performance},
+ month = {November 12-18},
+ title = {FLAMBES: Evolving Fast Performance Models},
+ year = {2011}}
+
+@article{ames:peds12,
+ abstract = {File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7--9 orders of magnitude more data, resulting in a number of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data. Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems. To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads.},
+ author = {Sasha Ames and Maya Gokhale and Carlos Maltzahn},
+ date-added = {2012-02-27 18:02:43 +0000},
+ date-modified = {2020-01-05 05:32:03 -0700},
+ journal = {International Journal of Parallel, Emergent and Distributed Systems},
+ keywords = {papers, metadata, management, graphs, filesystems, datamanagement},
+ number = {2},
+ title = {QMDS: a file system metadata management service supporting a graph data model-based query language},
+ volume = {27},
+ year = {2012}}
+
+@techreport{ionkov:lanltr11,
+ author = {Ionkov, Latchesar and Lang, Michael and Maltzahn, Carlos},
+ date-added = {2012-01-24 15:38:48 +0000},
+ date-modified = {2012-01-24 15:38:48 +0000},
+ institution = {Los Alamos National Laboratory},
+ number = {LA-UR-11-11589},
+ title = {{DRepl: Optimizing Access to Application Data for Analysis and Visualization}},
+ type = {Technical Report},
+ year = {2011}}
+
+@inproceedings{liu:ppam11,
+ abstract = {Exascale supercomputers will have the potential for billion-way parallelism. While physical implementations of these systems are currently not available, HPC system designers can develop models of exascale systems to evaluate system design points. Modeling these systems and associated subsystems is a significant challenge. In this paper, we present the Co-design of Exascale Storage System (CODES) framework for evaluating exascale storage system design points. As part of our early work with CODES, we discuss the use of the CODES framework to simulate leadership-scale storage systems in a tractable amount of time using parallel discrete-event simulation. We describe the current storage system models and protocols included with the CODES framework and demonstrate the use of CODES through simulations of an existing petascale storage system.},
+ address = {Torun, Poland},
+ author = {Ning Liu and Christopher Carothers and Jason Cope and Philip Carns and Robert Ross and Adam Crume and Carlos Maltzahn},
+ booktitle = {PPAM 2011},
+ date-added = {2012-01-17 01:13:05 +0000},
+ date-modified = {2020-01-05 05:32:41 -0700},
+ keywords = {papers, simulation, exascale, storage, systems, parallel, filesystems, hpc},
+ month = {September 11-14},
+ title = {Modeling a Leadership-scale Storage System},
+ year = {2011}}
+
+@inproceedings{buck:sc11,
+ abstract = {Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats, resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions in I/O, both locally and over the network.},
+ address = {Seattle, WA},
+ author = {Joe Buck and Noah Watkins and Jeff LeFevre and Kleoni Ioannidou and Carlos Maltzahn and Neoklis Polyzotis and Scott A. Brandt},
+ booktitle = {SC '11},
+ date-added = {2011-08-02 22:58:10 +0000},
+ date-modified = {2020-01-05 05:34:48 -0700},
+ keywords = {papers, mapreduce, datamanagement, hpc, structured, netcdf},
+ month = {November},
+ read = {1},
+ title = {SciHadoop: Array-based Query Processing in Hadoop},
+ year = {2011}}
+
+@techreport{buck:tr0411,
+ author = {Joe Buck and Noah Watkins and Jeff LeFevre and Kleoni Ioannidou and Carlos Maltzahn and Neoklis Polyzotis and Scott A. Brandt},
+ date-added = {2011-05-27 00:06:15 -0700},
+ date-modified = {2011-05-27 00:15:42 -0700},
+ institution = {UCSC},
+ month = {April},
+ number = {UCSC-SOE-11-04},
+ title = {SciHadoop: Array-based Query Processing in Hadoop},
+ year = {2011}}
+
+@inproceedings{estolano:nsdi10,
+ address = {San Jose, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Ben Reed and Scott A. Brandt},
+ booktitle = {Poster Session at NSDI 2010},
+ date-added = {2011-05-26 23:31:27 -0700},
+ date-modified = {2020-01-05 05:36:43 -0700},
+ keywords = {shortpapers, metadata, mapreduce, ceph},
+ month = {April 28-30},
+ title = {Haceph: Scalable Metadata Management for Hadoop using Ceph},
+ year = {2010}}
+
+@inproceedings{wacha:eurosys10,
+ address = {Paris, France},
+ author = {Rosie Wacha and Scott A. Brandt and John Bent and Carlos Maltzahn},
+ booktitle = {Poster Session and Ph.D. Workshop at EuroSys 2010},
+ date-added = {2011-05-26 23:29:21 -0700},
+ date-modified = {2020-01-05 05:37:15 -0700},
+ keywords = {shortpapers, raid, flash},
+ month = {April 13-16},
+ title = {RAID4S: Adding SSDs to RAID Arrays},
+ year = {2010}}
+
+@inproceedings{ames:nas11,
+ address = {Dalian, China},
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ booktitle = {NAS 2011},
+ date-added = {2011-05-26 23:15:19 -0700},
+ date-modified = {2011-05-26 23:17:11 -0700},
+ keywords = {papers, metadata, graphs, linking, filesystems},
+ month = {July 28-30},
+ title = {QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language},
+ year = {2011}}
+
+@inproceedings{pineiro:rtas11,
+ abstract = {Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well-understood, but the interaction of communicating tasks with different timing characteristics is less well-understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.},
+ address = {Chicago, IL},
+ author = {Roberto Pineiro and Kleoni Ioannidou and Carlos Maltzahn and Scott A. Brandt},
+ booktitle = {RTAS 2011},
+ date-added = {2010-12-15 12:11:43 -0800},
+ date-modified = {2020-01-05 05:37:41 -0700},
+ keywords = {papers, memory, realtime, qos, performance, management},
+ month = {April 11-14},
+ title = {RAD-FLOWS: Buffering for Predictable Communication},
+ year = {2011}}
+
+@article{maltzahn:login10,
+ abstract = {The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop.},
+ author = {Carlos Maltzahn and Esteban Molina-Estolano and Amandeep Khurana and Alex J. Nelson and Scott A. Brandt and Sage A. Weil},
+ date-added = {2010-09-30 15:19:48 -0700},
+ date-modified = {2020-01-05 05:43:26 -0700},
+ journal = {;login: The USENIX Magazine},
+ keywords = {papers, filesystems, parallel, hadoop, mapreduce, storage},
+ number = {4},
+ title = {Ceph as a Scalable Alternative to the Hadoop Distributed File System},
+ volume = {35},
+ year = {2010}}
+
+@inproceedings{maltzahn:fast10,
+ address = {San Jose, CA},
+ author = {Carlos Maltzahn and Michael Mateas and Jim Whitehead},
+ booktitle = {Work-in-Progress and Poster Session at FAST'10},
+ date-added = {2010-03-01 16:46:58 -0800},
+ date-modified = {2020-01-05 05:44:34 -0700},
+ keywords = {shortpapers, casual, games, ir, datamanagement, pim},
+ month = {February 24-27},
+ title = {InfoGarden: A Casual-Game Approach to Digital Archive Management},
+ year = {2010},
+ bdsk-url-1 = {http://www.cs.ucsc.edu/%7Ecarlosm/Infogarden/FAST_2010_WiP_Talk.html}}
+
+@inproceedings{polte:fast10,
+ address = {San Jose, CA},
+ author = {Milo Polte and Esteban Molina-Estolano and John Bent and Scott A. Brandt and Garth A. Gibson and Maya Gokhale and Carlos Maltzahn and Meghan Wingate},
+ booktitle = {Work-in-Progress and Poster Session at FAST'10},
+ date-added = {2010-03-01 16:40:17 -0800},
+ date-modified = {2010-03-01 16:46:40 -0800},
+ month = {February 24-27},
+ title = {Enabling Scientific Application I/O on Cloud FileSystems},
+ year = {2010}}
+
+@techreport{ames:tr0710,
+ author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ date-added = {2010-02-04 09:10:55 -0800},
+ date-modified = {2010-02-04 09:13:17 -0800},
+ institution = {UCSC},
+ month = {February},
+ number = {UCSC-SOE-10-07},
+ title = {Design and Implementation of a Metadata-Rich File System},
+ year = {2010}}
+
+@inproceedings{brandt:pdsw09,
+ abstract = {File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze large-scale data, we are developing Damasc, an enhanced file system where rich data management services for scientific computing are provided as a native part of the file system.
+This paper presents our vision for Damasc, a performant file system that would allow scientists or even casual users to pose declarative queries and updates over views of underlying files that are stored in their native bytestream format. In Damasc, a configurable layer is added on top of the file system to expose the contents of files in a logical data model through which views can be defined and used for queries and updates. The logical data model and views are leveraged to optimize access to files through caching and self-organizing indexing. In addition, provenance capture and analysis of file access is also built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code.},
+ address = {Portland, OR},
+ author = {Scott A. Brandt and Carlos Maltzahn and Neoklis Polyzotis and Wang-Chiew Tan},
+ booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ date-added = {2010-01-26 23:50:43 -0800},
+ date-modified = {2020-05-10 19:30:37 -0700},
+ keywords = {papers, datamanagement, filesystems, programmable},
+ month = {November 15},
+ title = {Fusing Data Management Services with File Systems},
+ year = {2009}}
+
+@inproceedings{estolano:fast09,
+ address = {San Francisco, CA},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and Scott A. Brandt and John Bent},
+ booktitle = {WiP at FAST '09},
+ date-added = {2010-01-13 22:52:32 -0800},
+ date-modified = {2010-01-13 23:06:07 -0800},
+ month = {February 24-27},
+ title = {Comparing the Performance of Different Parallel File System Placement Strategies},
+ year = {2009}}
+
+@inproceedings{estolano:pdsw09,
+ abstract = {MapReduce-tailored distributed filesystems---such as HDFS for Hadoop MapReduce---and parallel high-performance computing filesystems are tailored for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently.
+We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each file system when it concurrently runs its ``native'' workload as well as the non-native workload.},
+ address = {Portland, OR},
+ author = {Esteban Molina-Estolano and Maya Gokhale and Carlos Maltzahn and John May and John Bent and Scott Brandt},
+ booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ date-added = {2010-01-03 23:04:09 -0800},
+ date-modified = {2020-01-05 05:51:32 -0700},
+ keywords = {papers, performance, hpc, mapreduce, filesystems},
+ month = {November 15},
+ title = {Mixing Hadoop and HPC Workloads on Parallel Filesystems},
+ year = {2009}}
+
+@inproceedings{ames:sosp07,
+ address = {Stevenson, WA},
+ author = {Sasha Ames and Carlos Maltzahn and Ethan L. Miller},
+ booktitle = {Poster Session at the 21st Symposium on Operating Systems Principles (SOSP 2007)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2019-12-29 16:44:11 -0800},
+ keywords = {shortpapers, metadata, filesystems, querying},
+ month = {October},
+ title = {A File System Query Language},
+ year = {2007}}
+
+@techreport{weil:tr-ucsc-ceph06,
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Darrell D. E. Long and Carlos Maltzahn},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ month = {Jan},
+ number = {SSRC-06-02},
+ title = {Ceph: A Scalable Object-based Storage System},
+ year = {2006},
+ bdsk-url-1 = {http://ceph.newdream.net/weil-thesis.pdf}}
+
+@techreport{weil:tr-ucsc-crush06,
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Carlos Maltzahn},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ month = {Jan},
+ number = {SSRC-06-01},
+ title = {{CRUSH}: Controlled, Scalable, Decentralized Placement of Replicated Data},
+ year = {2006},
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/weil-tr-crush06.pdf}}
+
+@inproceedings{bigelow:pdsw07,
+ abstract = {Many applications---for example, scientific simulation, real-time data acquisition, and distributed reservation systems---have I/O performance requirements, yet most large, distributed storage systems lack the ability to guarantee I/O performance. We are working on end-to-end performance management in scalable, distributed storage systems. The kinds of storage systems we are targeting include large high-performance computing (HPC) clusters, which require both large data volumes and high I/O rates, as well as large-scale general-purpose storage systems.},
+ address = {Reno, NV},
+ author = {David Bigelow and Suresh Iyer and Tim Kaldewey and Roberto Pineiro and Anna Povzner and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2020-01-05 05:56:32 -0700},
+ keywords = {papers, performance, management, distributed, storage, scalable},
+ title = {End-to-end Performance Management for Scalable Distributed Storage},
+ year = {2007}}
+
+@inproceedings{bobb:wdas06,
+ address = {San Jose, CA},
+ author = {Nikhil Bobb and Damian Eads and Mark W. Storer and Scott A. Brandt and Carlos Maltzahn and Ethan L. Miller},
+ booktitle = {Proceedings of the 7th International Workshop on Distributed Data and Structures (WDAS 2006)},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2020-01-05 05:57:37 -0700},
+ month = {January},
+ title = {{Graffiti}: A Framework for Testing Collaborative Distributed Metadata},
+ year = {2006}}
+
+@inproceedings{storer:sisw05,
+ author = {Mark Storer and Kevin Greenan and Ethan L. Miller and Carlos Maltzahn},
+ booktitle = {Proceedings of the 3rd International IEEE Security in Storage Workshop},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:42 -0800},
+ month = dec,
+ title = {POTSHARDS: Storing Data for the Long-term Without Encryption},
+ year = 2005,
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/storer-sisw05.pdf}}
+
+@techreport{weil:tr-ucsc-rados07,
+ author = {Sage Weil and Carlos Maltzahn and Scott A. Brandt},
+ date-added = {2009-09-29 12:08:09 -0700},
+ date-modified = {2009-12-14 11:54:46 -0800},
+ institution = {University of California, Santa Cruz},
+ month = {Jan},
+ note = {Please notify the authors when citing this tech report in a paper for publication},
+ number = {SSRC-07-01},
+ title = {RADOS: A Reliable Autonomic Distributed Object Store},
+ year = {2007},
+ bdsk-url-1 = {http://www.ssrc.ucsc.edu/Papers/weil-tr-rados07.pdf}}
+
+@inproceedings{buck:dadc09,
+ abstract = {High-end computing is increasingly I/O bound as computations become more data-intensive, and data transport technologies struggle to keep pace with the demands of large-scale, distributed computations. One approach to avoiding unnecessary I/O is to move the processing to the data, as seen in Google's successful, but relatively specialized, MapReduce system. This paper discusses our investigation towards a general solution for enabling in-situ computation in a peta-scale storage system. We believe our work with flexible, application-specific structured storage is the key to addressing the I/O overhead caused by data partitioning across storage nodes. In order to manage competing workloads on storage nodes, our research in system performance management is leveraged. Our ultimate goal is a general framework for in-situ data-intensive processing, indexing, and searching, which we expect to provide orders of magnitude performance increases for data-intensive workloads.},
+ address = {Munich, Germany},
+ author = {Joe Buck and Noah Watkins and Carlos Maltzahn and Scott A. Brandt},
+ booktitle = {2nd International Workshop on Data-Aware Distributed Computing (in conjunction with HPDC-18)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:01:11 -0700},
+ keywords = {papers, filesystems, programmable},
+ month = {June 9},
+ title = {Abstract Storage: Moving file format-specific abstractions into petabyte-scale storage systems},
+ year = {2009}}
+
+@inproceedings{brandt:ospert08,
+ abstract = {Real-time systems are growing in size and complexity and must often manage multiple competing tasks in environments where the CPU is not the only limited shared resource. Memory, network, and other devices may also be shared, and system-wide performance guarantees may require the allocation and scheduling of many diverse resources. We present our on-going work on performance management in a representative distributed real-time system---a distributed storage system with performance requirements---and discuss our integrated model for managing diverse resources to provide end-to-end performance guarantees.},
+ address = {Prague, Czech Republic},
+ author = {Scott A. Brandt and Carlos Maltzahn and Anna Povzner and Roberto Pineiro and Andrew Shewmaker and Tim Kaldewey},
+ booktitle = {OSPERT 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:01:44 -0700},
+ keywords = {papers, storage, systems, distributed, performance, management, qos, realtime},
+ month = {July},
+ title = {An Integrated Model for Performance Management in a Distributed System},
+ year = {2008}}
+
+@article{estolano:jpcs09,
+ abstract = {Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges, scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.},
+ author = {Esteban Molina-Estolano and Carlos Maltzahn and John Bent and Scott A. Brandt},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:02:20 -0700},
+ journal = {J. Phys.: Conf. Ser.},
+ keywords = {papers, performance, simulation, filesystems},
+ number = {012050},
+ title = {Building a Parallel File System Simulator},
+ volume = {126},
+ year = {2009}}
+
+@inproceedings{weil:osdi06,
+ abstract = {We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.},
+ address = {Seattle, WA},
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Darrell D. E. Long and Carlos Maltzahn},
+ booktitle = {OSDI'06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:03:57 -0700},
+ keywords = {papers, parallel, filesystems, distributed, storage, systems, obsd, p2p},
+ month = {November},
+ read = {1},
+ title = {{Ceph}: A Scalable, High-Performance Distributed File System},
+ year = {2006}}
+
+@inproceedings{maltzahn:chi95,
+ abstract = {In a research community each researcher knows only a small fraction of the vast number of tools offered in the continually changing environment of local computer networks. Since the on-line or off-line documentation for these tools poorly supports people in finding the best tool for a given task, users prefer to ask colleagues. However, finding the right person to ask can be time consuming and asking questions can reveal incompetence. In this paper we present an architecture for a community-sensitive help system which actively collects information about Unix tools by tapping into accounting information generated by the operating system and by interviewing users that are selected on the basis of collected information. The result is a help system that continually seeks to update itself, that contains information that is entirely based on the community's perspective on tools, and that consequently grows with the community and its dynamic environments.},
+ address = {Denver, CO},
+ author = {Carlos Maltzahn},
+ booktitle = {CHI '95},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:06:12 -0700},
+ keywords = {papers, cscw},
+ month = {May},
+ title = {Community Help: Discovering Tools and Locating Experts in a Dynamic Environment},
+ year = {1995},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tY2hpOTUucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////Em1hbHR6YWhuLWNoaTk1LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////YW57lAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFNAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tY2hpOTUucGRmAAAOACYAEgBtAGEAbAB0AHoAYQBoAG4ALQBjAGgAaQA5ADUALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLWNoaTk1LnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC}}
+
+@inproceedings{weil:sc06,
+ abstract = {Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available resources and maximize system performance, while facilitating system growth and managing hardware failures. We have developed CRUSH, a scalable pseudo-random data distribution function designed for distributed object-based storage systems that efficiently maps data objects to storage devices without relying on a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage while minimizing unnecessary data movement. The algorithm accommodates a wide variety of data replication and reliability mechanisms and distributes data in terms of user-defined policies that enforce separation of replicas across failure domains.},
+ address = {Tampa, FL},
+ author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Carlos Maltzahn},
+ booktitle = {SC '06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:10:11 -0700},
+ keywords = {papers, hashing, parallel, filesystems, placement, related:ceph, obsd},
+ month = {November},
+ publisher = {ACM},
+ title = {{CRUSH}: Controlled, Scalable, Decentralized Placement of Replicated Data},
+ year = {2006},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAoLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1zYzA2LnBkZk8RAVQAAAAAAVQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////w13ZWlsLXNjMDYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2tTq2QAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABVwAAAgAyLzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpXOndlaWwtc2MwNi5wZGYADgAcAA0AdwBlAGkAbAAtAHMAYwAwADYALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADBVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9XL3dlaWwtc2MwNi5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQATwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn}}
+
+@unpublished{kroeger:unpublished96,
+ author = {Thomas M. Kroeger and Jeff Mogul and Carlos Maltzahn},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2009-12-14 11:55:00 -0800},
+ local-url = {/Users/carlosmalt/Documents/Papers/kroeger-unpublished96.pdf},
+ note = {ftp://ftp.digital.com/pub/DEC/traces/proxy/webtraces.v1.2.html},
+ title = {{D}igital's {W}eb Proxy Traces},
+ year = {1996}}
+
+@article{povzner:osr08,
+ abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:12:06 -0700},
+ journal = {Operating Systems Review},
+ keywords = {papers, predictable, performance, storage, media, realtime},
+ month = {May},
+ number = {4},
+ pages = {13-25},
+ title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ volume = {42},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAsLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG92em5lci1vc3IwOC5wZGZPEQFkAAAAAAFkAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8RcG92em5lci1vc3IwOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////9oqsQcAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAVAAAAIANi86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6UDpwb3Z6bmVyLW9zcjA4LnBkZgAOACQAEQBwAG8AdgB6AG4AZQByAC0AbwBzAHIAMAA4AC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA0VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUC9wb3Z6bmVyLW9zcjA4LnBkZgATAAEvAAAVAAIAEf//AAAACAANABoAJABTAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbs=}}
+
+@inproceedings{povzner:eurosys08,
+ abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ address = {Glasgow, Scotland},
+ author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ booktitle = {Eurosys 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:12:26 -0700},
+ keywords = {papers, performance, management, storage, systems, fahrrad, rbed, realtime, qos},
+ month = {March 31 - April 4},
+ title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG92em5lci1ldXJvc3lzMDgucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FXBvdnpuZXItZXVyb3N5czA4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////asg05AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFQAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlA6cG92em5lci1ldXJvc3lzMDgucGRmAA4ALAAVAHAAbwB2AHoAbgBlAHIALQBlAHUAcgBvAHMAeQBzADAAOAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1AvcG92em5lci1ldXJvc3lzMDgucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
+
+@article{maltzahn:ddas07,
+ abstract = {Managing storage in the face of relentless growth in the number and variety of files on storage systems creates demand for rich file system metadata as is made evident by the recent emergence of rich metadata support in many applications as well as file systems. Yet, little support exists for sharing metadata across file systems even though it is not uncommon for users to manage multiple file systems and to frequently share copies of files across devices and with other users. Encouraged by the surge in popularity of collaborative bookmarking sites that share the burden of creating metadata for online content [21], we present Graffiti, a distributed organization layer for collaboratively sharing rich metadata across heterogeneous file systems. The primary purpose of Graffiti is to provide a research and rapid prototyping platform for managing metadata across file systems and users.},
+ author = {Carlos Maltzahn and Nikhil Bobb and Mark W. Storer and Damian Eads and Scott A. Brandt and Ethan L. Miller},
+ booktitle = {Distributed Data \& Structures 7},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:13:12 -0700},
+ editor = {Thomas Schwarz},
+ journal = {Proceedings in Informatics},
+ keywords = {papers, pim, tagging, distributed, naming, linking, metadata},
+ pages = {97-111},
+ publisher = {Carleton Scientific},
+ read = {Yes},
+ title = {Graffiti: A Framework for Testing Collaborative Distributed Metadata},
+ volume = {21},
+ year = {2007},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tZGRhczA3LnBkZk8RAWwAAAAAAWwAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xNtYWx0emFobi1kZGFzMDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2OF6lwAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTQAAAgA4LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLWRkYXMwNy5wZGYADgAoABMAbQBhAGwAdAB6AGEAaABuAC0AZABkAGEAcwAwADcALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADZVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLWRkYXMwNy5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAVQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF}}
+
+@inproceedings{rose:caise92,
+ abstract = {Repositories provide the information system's support to layer software environments. Initially, repository technology has been dominated by object representation issues. Teams are not part of the ball game. In this paper, we propose the concept of sharing processes which supports distribution and sharing of objects and tasks by teams. Sharing processes are formally specified as classes of non-deterministic finite automata connected to each other by deduction rules. They are intended to coordinate object access and communication for task distribution in large development projects. In particular, we show how interactions between both sharings improve object management.},
+ address = {Manchester, UK},
+ author = {Thomas Rose and Carlos Maltzahn and Matthias Jarke},
+ booktitle = {Advanced Information Systems Engineering (CAiSE'92)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:14:53 -0700},
+ editor = {Pericles Loucopoulos},
+ keywords = {papers, sharing, cscw, datamanagement},
+ month = {May 12--15},
+ pages = {17--32},
+ publisher = {Springer Berlin / Heidelberg},
+ series = {Lecture Notes in Computer Science},
+ title = {Integrating object and agent worlds},
+ volume = {593},
+ year = {1992},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1EtUi9yb3NlLWNhaXNlOTIucGRmTxEBaAAAAAABaAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////EHJvc2UtY2Fpc2U5Mi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////G1X/VAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAANRLVIAAAIANy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6US1SOnJvc2UtY2Fpc2U5Mi5wZGYAAA4AIgAQAHIAbwBzAGUALQBjAGEAaQBzAGUAOQAyAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA1VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvUS1SL3Jvc2UtY2Fpc2U5Mi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFQAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABwA==}}
+
+@inproceedings{ames:mss06,
+ abstract = {As the number and variety of files stored and accessed by a typical user has dramatically increased, existing file system structures have begun to fail as a mechanism for managing all of the information contained in those files. Many applications---email clients, multimedia management applications, and desktop search engines are examples--- have been forced to develop their own richer metadata infrastructures. While effective, these solutions are generally non-standard, non-portable, non-sharable across applications, users or platforms, proprietary, and potentially inefficient. In the interest of providing a rich, efficient, shared file system metadata infrastructure, we have developed the Linking File System (LiFS). Taking advantage of non-volatile storage class memories, LiFS supports a wide variety of user and application metadata needs while efficiently supporting traditional file system operations.},
+ address = {College Park, MD},
+ author = {Sasha Ames and Nikhil Bobb and Kevin M. Greenan and Owen S. Hofmann and Mark W. Storer and Carlos Maltzahn and Ethan L. Miller and Scott A. Brandt},
+ booktitle = {MSST '06},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:15:24 -0700},
+ keywords = {papers, linking, systems, storage, metadata, storagemedium, related:quasar, filesystems},
+ local-url = {/Users/carlosmalt/Documents/Papers/ames-mss06.pdf},
+ month = {May},
+ organization = {IEEE},
+ title = {{LiFS}: An Attribute-Rich File System for Storage Class Memories},
+ year = {2006},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNi5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8OYW1lcy1tc3MwNi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////8WHtvQAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUEAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLW1zczA2LnBkZgAADgAeAA4AYQBtAGUAcwAtAG0AcwBzADAANgAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNi5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==}}
+
+@inproceedings{maltzahn:wcw99,
+ abstract = {The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates. We apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites.},
+ address = {San Diego, CA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald and James Martin},
+ booktitle = {4th International Web Caching Workshop (WCW'99)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:17:30 -0700},
+ keywords = {papers, networking, intermediary, machinelearning, webcaching},
+ month = {March 31 - April 2},
+ title = {On Bandwidth Smoothing},
+ year = {1999},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4td2N3OTkucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////Em1hbHR6YWhuLXdjdzk5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////M5620AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFNAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4td2N3OTkucGRmAAAOACYAEgBtAGEAbAB0AHoAYQBoAG4ALQB3AGMAdwA5ADkALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXdjdzk5LnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC}}
+
+@article{maltzahn:per97,
+ abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on the request service time.},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:18:29 -0700},
+ journal = {ACM SIGMETRICS Performance Evaluation Review},
+ keywords = {papers, performance, webcaching, networking, intermediary},
+ month = {June},
+ number = {1},
+ pages = {13-23},
+ title = {Performance Issues of Enterprise Level Web Proxies},
+ volume = {25},
+ year = {1997},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZk8RAYQAAAAAAYQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xltYWx0emFobi1zaWdtZXRyaWNzOTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2iqrEAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTQAAAgA+LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADxVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHj}}
+
+@inproceedings{maltzahn:sigmetrics97,
+ abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on the request service time.},
+ address = {Seattle, WA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ booktitle = {SIGMETRICS 1997},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:19:28 -0700},
+ keywords = {papers, performance, tracing, networking, intermediary, webcaching},
+ month = {June 15-18},
+ pages = {13--23},
+ read = {Yes},
+ title = {Performance Issues of Enterprise Level Web Proxies},
+ year = {1997},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxA0Li4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZk8RAYQAAAAAAYQAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xltYWx0emFobi1zaWdtZXRyaWNzOTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2iqrEAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABTQAAAgA+LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADxVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAWwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHj}}
+
+@inproceedings{kaldewey:fast08wip,
+ address = {San Jose, CA},
+ author = {Tim Kaldewey and Andrew Shewmaker and Richard Golding and Carlos Maltzahn and Theodore Wong and Scott A. Brandt},
+ booktitle = {Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:42:04 -0800},
+ keywords = {shortpapers, qos, networking, storage},
+ month = {February 26-29},
+ title = {RADoN: QoS in Storage Networks},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAxLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0sva2FsZGV3ZXktZmFzdDA4d2lwLnBkZk8RAXoAAAAAAXoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xZrYWxkZXdleS1mYXN0MDh3aXAucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2i6HagAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABSwAAAgA7LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpLOmthbGRld2V5LWZhc3QwOHdpcC5wZGYAAA4ALgAWAGsAYQBsAGQAZQB3AGUAeQAtAGYAYQBzAHQAMAA4AHcAaQBwAC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA5VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvSy9rYWxkZXdleS1mYXN0MDh3aXAucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABYAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAdY=}}
+
+@inproceedings{weil:pdsw07,
+ abstract = {Brick and object-based storage architectures have emerged as a means of improving the scalability of storage clusters. However, existing systems continue to treat storage nodes as passive devices, despite their ability to exhibit significant intelligence and autonomy. We present the design and implementation of RADOS, a reliable object storage service that scales to many thousands of devices by leveraging the intelligence present in individual storage nodes. RADOS preserves consistent data access and strong safety semantics while allowing nodes to act semi-autonomously to self-manage replication, failure detection, and failure recovery through the use of a small cluster map. Our implementation offers excellent performance, reliability, and scalability while providing clients with the illusion of a single logical object store.},
+ address = {Reno, NV},
+ author = {Sage A. Weil and Andrew Leung and Scott A. Brandt and Carlos Maltzahn},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:20:07 -0700},
+ keywords = {papers, obsd, distributed, storage, systems, related:x10},
+ local-url = {/Users/carlosmalt/Documents/Papers/weil-pdsw07.pdf},
+ month = {November},
+ title = {RADOS: A Fast, Scalable, and Reliable Storage Service for Petabyte-scale Storage Clusters},
+ year = {2007},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAqLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1wZHN3MDcucGRmTxEBXAAAAAABXAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////D3dlaWwtcGRzdzA3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////F1X98AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFXAAACADQvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlc6d2VpbC1wZHN3MDcucGRmAA4AIAAPAHcAZQBpAGwALQBwAGQAcwB3ADAANwAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMlVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1wZHN3MDcucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFEAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==}}
+
+@inproceedings{maltzahn:usenix99,
+ abstract = {The dramatic increase of HTTP traffic on the Internet has resulted in wide-spread use of large caching proxy servers as critical Internet infrastructure components. With continued growth the demand for larger caches and higher performance proxies grows as well. The common bottleneck of large caching proxy servers is disk I/O. In this paper we evaluate ways to reduce the amount of required disk I/O. First we compare the file system interactions of two existing web proxy servers, CERN and SQUID. Then we show how design adjustments to the current SQUID cache architecture can dramatically reduce disk I/O. Our findings suggest that two strategies can significantly reduce disk I/O: (1) preserve locality of the HTTP reference stream while translating these references into cache references, and (2) use virtual memory instead of the file system for objects smaller than the system page size. The evaluated techniques reduced disk I/O by 50% to 70%.},
+ address = {Monterey, CA},
+ author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ booktitle = {USENIX ATC '99},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:20:58 -0700},
+ keywords = {papers, networking, intermediary, storage, webcaching},
+ month = {June 6-11},
+ read = {Yes},
+ title = {Reducing the Disk I/O of Web Proxy Server Caches},
+ year = {1999},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tdXNlbml4OTkucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FW1hbHR6YWhuLXVzZW5peDk5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aKqp1AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFNAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tdXNlbml4OTkucGRmAA4ALAAVAG0AYQBsAHQAegBhAGgAbgAtAHUAcwBlAG4AaQB4ADkAOQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tdXNlbml4OTkucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
+
+@inproceedings{ames:mss05,
+ abstract = {Traditional file systems provide a weak and inadequate structure for meaningful representations of file interrelationships and other context-providing metadata. Existing designs, which store additional file-oriented metadata either in a database, on disk, or both, are limited by the technologies upon which they depend. Moreover, they do not provide for user-defined relationships among files. To address these issues, we created the Linking File System (LiFS), a file system design in which files may have both arbitrary user- or application-specified attributes, and attributed links between files. In order to assure performance when accessing links and attributes, the system is designed to store metadata in non-volatile memory. This paper discusses several use cases that take advantage of this approach and describes the user-space prototype we developed to test the concepts presented.
+},
+ address = {Monterey, CA},
+ author = {Alexander Ames and Nikhil Bobb and Scott A. Brandt and Adam Hiatt and Carlos Maltzahn and Ethan L. Miller and Alisa Neeman and Deepa Tuteja},
+ booktitle = {MSST '05},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:21:32 -0700},
+ keywords = {papers, ssrc, metadata, filesystems, linking},
+ local-url = {/Users/carlosmalt/Documents/Papers/ames-mss05.pdf},
+ month = {April},
+ title = {Richer File System Metadata Using Links and Attributes},
+ year = {2005},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxApLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNS5wZGZPEQFaAAAAAAFaAAIAAAxNYWNpbnRvc2ggSEQAAAAAAAAAAAAAAAAAAADg73OeQkQAAf////8OYW1lcy1tc3MwNS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////8RqJcwAAAAAAAAAAAADAAQAAAogY3UAAAAAAAAAAAAAAAAAAUEAAAIAMy86VXNlcnM6Y2FybG9zbWFsdDpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLW1zczA1LnBkZgAADgAeAA4AYQBtAGUAcwAtAG0AcwBzADAANQAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAMVVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNS5wZGYAABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABrg==}}
+
+@inproceedings{koren:pdsw07,
+ abstract = {As users interact with file systems of ever increasing size, it is becoming more difficult for them to familiarize themselves with the entire contents of the file system. In petabyte-scale systems, users must navigate a pool of billions of shared files in order to find the information they are looking for. One way to help alleviate this problem is to integrate navigation and search into a common framework.
+One such method is faceted search. This method originated within the information retrieval community, and has proved popular for navigating large repositories, such as those in e-commerce sites and digital libraries. This paper introduces faceted search and outlines several current research directions in adapting faceted search techniques to petabyte-scale file systems.},
+ address = {Reno, NV},
+ author = {Jonathan Koren and Yi Zhang and Sasha Ames and Andrew Leung and Carlos Maltzahn and Ethan L. Miller},
+ booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:24:17 -0700},
+ keywords = {papers, ir, filesystems, metadata, facets, search},
+ month = {November},
+ title = {Searching and Navigating Petabyte Scale File Systems Based on Facets},
+ year = {2007},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxArLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0sva29yZW4tcGRzdzA3LnBkZk8RAWIAAAAAAWIAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xBrb3Jlbi1wZHN3MDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////2jcjVgAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABSwAAAgA1LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpLOmtvcmVuLXBkc3cwNy5wZGYAAA4AIgAQAGsAbwByAGUAbgAtAHAAZABzAHcAMAA3AC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgAzVXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvSy9rb3Jlbi1wZHN3MDcucGRmAAATAAEvAAAVAAIAEf//AAAACAANABoAJABSAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAbg=}}
+
+@article{jarke:ijicis92,
+ abstract = {Information systems support for design environments emphasizes object management and tends to neglect the growing demand for team support. Process management is often tackled by rigid technological protocols which are likely to get in the way of group productivity and quality. Group tools must be introduced in an unobtrusive way which extends current practice yet provides structure and documentation of development experiences. The concept of sharing processes allows agents to coordinate the sharing of ideas, tasks, and results by interacting protocol automata which can be dynamically adapted to situational requirements. Inconsistency is managed with the same emphasis as consistency. The sharing process approach has been implemented in a system called ConceptTalk which has been experimentally integrated with design environments for information and hypertext systems.},
+ author = {Matthias Jarke and Carlos Maltzahn and Thomas Rose},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:25:27 -0700},
+ journal = {International Journal of Intelligent and Cooperative Information Systems},
+ keywords = {papers, sharing, cscw, datamanagement},
+ number = {1},
+ pages = {145--167},
+ title = {Sharing Processes: Team Coordination in Design Repositories},
+ volume = {1},
+ year = {1992},
+ bdsk-url-1 = {https://www.worldscientific.com/doi/abs/10.1142/S0218215792000076}}
+
+@inproceedings{ellis:hicss97,
+ abstract = {Chautauqua is an exploratory workflow management system designed and implemented within the Collaboration Technology Research group (CTRG) at the University of Colorado. This system represents a tightly knit merger of workflow technology and groupware technology. Chautauqua has been in test usage at the University of Colorado since 1995. This document discusses Chautauqua - its motivation, its design, and its implementation. Our emphasis here is on its novel features, and the techniques for implementing these features.},
+ address = {Wailea, Maui, HI},
+ author = {Clarence E. Ellis and Carlos Maltzahn},
+ booktitle = {30th Hawaii International Conference on System Sciences, Information System Track},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:26:44 -0700},
+ keywords = {papers, workflow, cscw},
+ month = {January},
+ title = {The Chautauqua Workflow System},
+ year = {1997},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0UtRi9lbGxpcy1oaWNzczk3LnBkZk8RAWoAAAAAAWoAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xFlbGxpcy1oaWNzczk3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////vNABiAAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAADRS1GAAACADgvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOkUtRjplbGxpcy1oaWNzczk3LnBkZgAOACQAEQBlAGwAbABpAHMALQBoAGkAYwBzAHMAOQA3AC4AcABkAGYADwAaAAwATQBhAGMAaQBuAHQAbwBzAGgAIABIAEQAEgA2VXNlcnMvY2FybG9zbWFsdC9NeSBEcml2ZS9QYXBlcnMvRS1GL2VsbGlzLWhpY3NzOTcucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFUAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABww==}}
+
+@misc{mowat:netapp07,
+ author = {J. Eric Mowat and Yee-Peng Wang and Carlos Maltzahn and Raghu C. Mallena},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:52:04 -0800},
+ keywords = {patents, caching, webcaching},
+ month = {July},
+ title = {United States Patent 7,249,219: Method and Apparatus to Improve Buffer Cache Hit Rate},
+ year = {2007},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAtLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL00vbW93YXQtbmV0YXBwMDcucGRmTxEBagAAAAABagACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////Em1vd2F0LW5ldGFwcDA3LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aLopKAAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFNAAACADcvOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOk06bW93YXQtbmV0YXBwMDcucGRmAAAOACYAEgBtAG8AdwBhAHQALQBuAGUAdABhAHAAcAAwADcALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADVVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9NL21vd2F0LW5ldGFwcDA3LnBkZgAAEwABLwAAFQACABH//wAAAAgADQAaACQAVAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHC}}
+
+@inproceedings{kaldewey:rtas08,
+ abstract = {Large- and small-scale storage systems frequently serve a mixture of workloads, an increasing number of which require some form of performance guarantee. Providing guaranteed disk performance---the equivalent of a ``virtual disk''---is challenging because disk requests are non-preemptible and their execution times are stateful, partially non-deterministic, and can vary by orders of magnitude. Guaranteeing throughput, the standard measure of disk performance, requires worst-case I/O time assumptions orders of magnitude greater than average I/O times, with correspondingly low performance and poor control of the resource allocation. We show that disk time utilization---analogous to CPU utilization in CPU scheduling and the only fully provisionable aspect of disk performance---yields greater control, more efficient use of disk resources, and better isolation between request streams than bandwidth or I/O rate when used as the basis for disk reservation and scheduling.},
+ address = {St. Louis, Missouri},
+ annote = {Springer Journal of Real-Time Systems Award for Best Student Paper},
+ author = {Tim Kaldewey and Anna Povzner and Theodore Wong and Richard Golding and Scott A. Brandt and Carlos Maltzahn},
+ booktitle = {RTAS 2008},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2020-01-05 06:27:49 -0700},
+ keywords = {papers, performance, management, storage, systems, fahrrad, rbed, qos},
+ month = {April},
+ title = {Virtualizing Disk Performance},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAuLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL0sva2FsZGV3ZXktcnRhczA4LnBkZk8RAWwAAAAAAWwAAgAADE1hY2ludG9zaCBIRAAAAAAAAAAAAAAAAAAAAODvc55CRAAB/////xNrYWxkZXdleS1ydGFzMDgucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////xT4Q0gAAAAAAAAAAAAMABAAACiBjdQAAAAAAAAAAAAAAAAABSwAAAgA4LzpVc2VyczpjYXJsb3NtYWx0Ok15IERyaXZlOlBhcGVyczpLOmthbGRld2V5LXJ0YXMwOC5wZGYADgAoABMAawBhAGwAZABlAHcAZQB5AC0AcgB0AGEAcwAwADgALgBwAGQAZgAPABoADABNAGEAYwBpAG4AdABvAHMAaAAgAEgARAASADZVc2Vycy9jYXJsb3NtYWx0L015IERyaXZlL1BhcGVycy9LL2thbGRld2V5LXJ0YXMwOC5wZGYAEwABLwAAFQACABH//wAAAAgADQAaACQAVQAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAHF}}
+
+@inproceedings{povzner:fast08wip,
+ author = {Anna Povzner and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ booktitle = {Work in Progress at 6th USENIX Conference on File and Storage Technologies (FAST '08)},
+ date-added = {2009-09-29 12:06:25 -0700},
+ date-modified = {2019-12-29 16:55:18 -0800},
+ keywords = {shortpapers, predictable, performance, storage},
+ title = {Virtualizing Disk Performance with Fahrrad},
+ year = {2008},
+ bdsk-file-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAwLi4vLi4vLi4vTXkgRHJpdmUvUGFwZXJzL1AvcG92em5lci1mYXN0MDh3aXAucGRmTxEBdAAAAAABdAACAAAMTWFjaW50b3NoIEhEAAAAAAAAAAAAAAAAAAAA4O9znkJEAAH/////FXBvdnpuZXItZmFzdDA4d2lwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP/////aLoN1AAAAAAAAAAAAAwAEAAAKIGN1AAAAAAAAAAAAAAAAAAFQAAACADovOlVzZXJzOmNhcmxvc21hbHQ6TXkgRHJpdmU6UGFwZXJzOlA6cG92em5lci1mYXN0MDh3aXAucGRmAA4ALAAVAHAAbwB2AHoAbgBlAHIALQBmAGEAcwB0ADAAOAB3AGkAcAAuAHAAZABmAA8AGgAMAE0AYQBjAGkAbgB0AG8AcwBoACAASABEABIAOFVzZXJzL2Nhcmxvc21hbHQvTXkgRHJpdmUvUGFwZXJzL1AvcG92em5lci1mYXN0MDh3aXAucGRmABMAAS8AABUAAgAR//8AAAAIAA0AGgAkAFcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABzw==}}
diff --git a/netlify.toml b/netlify.toml
index 0d0f88cd5b2..d3c9863e833 100644
--- a/netlify.toml
+++ b/netlify.toml
@@ -3,7 +3,7 @@
publish = "public"
[build.environment]
- HUGO_VERSION = "0.58.3"
+ HUGO_VERSION = "0.89.4"
HUGO_ENABLEGITINFO = "true"
[context.production.environment]
diff --git a/overwrite.bib b/overwrite.bib
new file mode 100644
index 00000000000..9c3c72f418f
--- /dev/null
+++ b/overwrite.bib
@@ -0,0 +1,508 @@
+
+@inproceedings{crume:pdsw12,
+ Abstract = {In Hadoop, mappers send data to reducers in the form of key/value pairs. The default design of Hadoop's process for transmitting this intermediate data can cause a very high overhead, especially for scientific data containing multiple variables in a multi-dimensional space. For example, for a 3D scalar field of a variable ``windspeed1'' the size of keys was 6.75 times the size of values. Much of the disk and network bandwidth of ``shuffling'' this intermediate data is consumed by repeatedly transmitting the variable name for each value. This significant waste of resources is due to an assumption fundamental to Hadoop's design that all key/value pairs are independent. This assumption is inadequate for scientific data which is often organized in regular grids, a structure that can be described in small, constant size.
+Earlier we presented SciHadoop, a slightly modified version of Hadoop designed for processing scientific data. We reported on experiments with SciHadoop which confirm that the size of intermediate data has a significant impact on overall performance. Here we show preliminary designs of multiple lossless approaches to compressing intermediate data, one of which results in up to five orders of magnitude reduction in the original key/value ratio.},
+ Address = {Salt Lake City, UT},
+ Author = {Adam Crume and Joe Buck and Carlos Maltzahn and Scott Brandt},
+ Booktitle = {PDSW'12},
+ Date-Added = {2012-11-02 06:02:29 +0000},
+ Date-Modified = {2020-01-05 06:29:22 -0700},
+ Keywords = {papers, mapreduce, compression, array},
+ Month = {November 12},
+ Title = {Compressing Intermediate Keys between Mappers and Reducers in SciHadoop},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASQy9jcnVtZS1wZHN3MTIucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGNydW1lLXBkc3cxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFDAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkM6Y3J1bWUtcGRzdzEyLnBkZgAOACIAEABjAHIAdQBtAGUALQBwAGQAcwB3ADEAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvQy9jcnVtZS1wZHN3MTIucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=},
+ Bdsk-File-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAZQy9jcnVtZS1wZHN3MTItc2xpZGVzLnBkZk8RAXwAAAAAAXwAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xdjcnVtZS1wZHN3MTItc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQwAAAgA/LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpDOmNydW1lLXBkc3cxMi1zbGlkZXMucGRmAAAOADAAFwBjAHIAdQBtAGUALQBwAGQAcwB3ADEAMgAtAHMAbABpAGQAZQBzAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAqL015IERyaXZlL1BhcGVycy9DL2NydW1lLXBkc3cxMi1zbGlkZXMucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAEAAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABwA==}}
+
+@inproceedings{he:pdsw12,
+ Abstract = {Checkpointing is the predominant storage driver in today's petascale supercomputers and is expected to remain as such in tomorrow's exascale supercomputers. Users typically prefer to checkpoint into a shared file, yet parallel file systems often perform poorly for shared file writing. A powerful technique to address this problem is to transparently transform shared file writing into many exclusively written files, as is done in ADIOS and PLFS. Unfortunately, the metadata to reconstruct the fragments into the original file grows with the number of writers. As such, the current approach cannot scale to exaflop supercomputers due to the large overhead of creating and reassembling the metadata.
+In this paper, we develop and evaluate algorithms by which patterns in the PLFS metadata can be discovered and then used to replace the current metadata. Our evaluation shows that these patterns reduce the size of the metadata by several orders of magnitude, increase the performance of writes by up to 40 percent, and the performance of reads by up to 480 percent. This contribution therefore can allow current checkpointing models to survive the transition from peta- to exascale.},
+ Address = {Salt Lake City, UT},
+ Author = {Jun He and John Bent and Aaron Torres and Gary Grider and Garth Gibson and Carlos Maltzahn and Xian-He Sun},
+ Booktitle = {PDSW'12},
+ Date-Added = {2012-11-02 06:00:38 +0000},
+ Date-Modified = {2020-01-05 05:28:43 -0700},
+ Keywords = {papers, compression, indexing, plfs, patterndetection, checkpointing},
+ Month = {November 12},
+ Read = {1},
+ Title = {Discovering Structure in Unstructured I/O},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPSC9oZS1wZHN3MTIucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWhlLXBkc3cxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFIAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkg6aGUtcGRzdzEyLnBkZgAADgAcAA0AaABlAC0AcABkAHMAdwAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0gvaGUtcGRzdzEyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=},
+ Bdsk-File-2 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWSC9oZS1wZHN3MTItc2xpZGVzLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRoZS1wZHN3MTItc2xpZGVzLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSAAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpIOmhlLXBkc3cxMi1zbGlkZXMucGRmAA4AKgAUAGgAZQAtAHAAZABzAHcAMQAyAC0AcwBsAGkAZABlAHMALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL0gvaGUtcGRzdzEyLXNsaWRlcy5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==}}
+
+@techreport{watkins:soetr12,
+ Abstract = {Cloud-based services have become an attractive alternative to in-house data centers because of their flexible, on-demand availability of compute and storage resources. This is also true for scientific high-performance computing (HPC) applications that are currently being run on expensive, dedicated hardware. One important challenge of HPC applications is their need to perform periodic global checkpoints of execution state to stable storage in order to recover from failures, but the checkpoint process can dominate the total run-time of HPC applications even in the failure-free case! In HPC architectures, dedicated stable storage is highly tuned for this type of workload using locality and physical layout policies, which are generally unknown in typical cloud environments. In this paper we introduce DataMods, an extended version of the Ceph file system and associated distributed object store RADOS, which are widely used in open source cloud stacks. DataMods extends object-based storage with extended services that take advantage of common cloud data center node hardware configurations (i.e., CPU and local storage resources) and that can be used to construct efficient, scalable middleware services that span the entire storage stack and utilize asynchronous services for offline data management.},
+ Address = {Santa Cruz, CA},
+ Author = {Noah Watkins and Carlos Maltzahn and Scott A. Brandt and Adam Manzanares},
+ Date-Added = {2012-07-21 11:39:45 +0000},
+ Date-Modified = {2020-01-05 05:29:20 -0700},
+ Institution = {University of California Santa Cruz},
+ Keywords = {papers, filesystems, programming, datamanagement},
+ Month = {July},
+ Number = {UCSC-SOE-12-07},
+ Title = {DataMods: Programmable File System Services},
+ Type = {Technical Report},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVVy93YXRraW5zLXNvZXRyMTIucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E3dhdGtpbnMtc29ldHIxMi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2F0a2lucy1zb2V0cjEyLnBkZgAADgAoABMAdwBhAHQAawBpAG4AcwAtAHMAbwBlAHQAcgAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL1cvd2F0a2lucy1zb2V0cjEyLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=}}
+
+@inproceedings{bhagwan:spe12,
+ Abstract = {In healthcare, de-identification is fast becoming a service that is indispensable when medical data needs to be used for research and secondary use purposes. Currently, this process is done either manually, by human agent, or by an automated software algorithm. Both approaches have shortcomings. Here, we introduce a framework for enhancing the outcome of the current modes of executing a de-identification service. This paper presents the steps taken in conceiving and building a privacy framework and tool that improves the service of de-identification. Further, we test the usefulness and applicability of this system through a study with HIPAA-trained experts.},
+ Address = {Honolulu, HI},
+ Author = {Varun Bhagwan and Tyrone Grandison and Carlos Maltzahn},
+ Booktitle = {IEEE 2012 Services Workshop on Security and Privacy Engineering (SPE2012)},
+ Date-Added = {2012-05-22 03:42:44 +0000},
+ Date-Modified = {2020-01-05 05:29:59 -0700},
+ Keywords = {papers, privacy, humancomputation, healthcare},
+ Month = {June},
+ Title = {Recommendation-based De-Identification: A Practical Systems Approach towards De-identification of Unstructured Text in Healthcare},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATQi9iaGFnd2FuLXNwZTEyLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFiaGFnd2FuLXNwZTEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQgAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpCOmJoYWd3YW4tc3BlMTIucGRmAAAOACQAEQBiAGgAYQBnAHcAYQBuAC0AcwBwAGUAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9CL2JoYWd3YW4tc3BlMTIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==}}
+
+@inproceedings{kato:usenix12,
+ Abstract = {Graphics processing units (GPUs) have become a very powerful platform embracing a concept of heterogeneous many-core computing. However, application domains of GPUs are currently limited to specific systems, largely due to a lack of ``first-class'' GPU resource management for general-purpose multi-tasking systems.
+We present Gdev, a new ecosystem of GPU resource management in the operating system (OS). It allows the user space as well as the OS itself to use GPUs as first-class computing resources. Specifically, Gdev's virtual memory manager supports data swapping for excessive memory resource demands, and also provides a shared device memory functionality that allows GPU contexts to communicate with other contexts. Gdev further provides a GPU scheduling scheme to virtualize a physical GPU into multiple logical GPUs, enhancing isolation among working sets of multi-tasking systems.
+Our evaluation conducted on Linux and the NVIDIA GPU shows that the basic performance of our prototype implementation is reliable even compared to proprietary software. Further detailed experiments demonstrate that Gdev achieves a 2x speedup for an encrypted file system using the GPU in the OS. Gdev can also improve the makespan of dataflow programs by up to 49% by exploiting shared device memory, while the error in the utilization of virtualized GPUs can be limited to within only 7%.},
+ Address = {Boston, MA},
+ Author = {Shinpei Kato and Michael McThrow and Carlos Maltzahn and Scott A. Brandt},
+ Booktitle = {USENIX ATC '12},
+ Date-Added = {2012-04-06 22:55:09 +0000},
+ Date-Modified = {2020-01-05 05:30:40 -0700},
+ Keywords = {papers, gpgpu, kernel, linux, scheduling},
+ Title = {Gdev: First-Class GPU Resource Management in the Operating System},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATSy9rYXRvLXVzZW5peDEyLnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFrYXRvLXVzZW5peDEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABSwAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpLOmthdG8tdXNlbml4MTIucGRmAAAOACQAEQBrAGEAdABvAC0AdQBzAGUAbgBpAHgAMQAyAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9LL2thdG8tdXNlbml4MTIucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==}}
+
+@inproceedings{liu:msst12,
+ Abstract = {The largest-scale high-performance computing (HPC) systems are stretching parallel file systems to their limits in terms of aggregate bandwidth and numbers of clients. To further sustain the scalability of these file systems, researchers and HPC storage architects are exploring various storage system designs. One proposed storage system design integrates a tier of solid-state burst buffers into the storage system to absorb application I/O requests. In this paper, we simulate and explore this storage system design for use by large-scale HPC systems. First, we examine application I/O patterns on an existing large-scale HPC system to identify common burst patterns. Next, we describe enhancements to the CODES storage system simulator to enable our burst buffer simulations. These enhancements include the integration of a burst buffer model into the I/O forwarding layer of the simulator, the development of an I/O kernel description language and interpreter, the development of a suite of I/O kernels that are derived from observed I/O patterns, and fidelity improvements to the CODES models. We evaluate the I/O performance for a set of multi-application I/O workloads and burst buffer configurations. We show that burst buffers can accelerate the application-perceived throughput to the external storage system and can reduce the amount of external storage bandwidth required to meet a desired application-perceived throughput goal.},
+ Address = {Pacific Grove, CA},
+ Author = {Ning Liu and Jason Cope and Philip Carns and Christopher Carothers and Robert Ross and Gary Grider and Adam Crume and Carlos Maltzahn},
+ Booktitle = {MSST/SNAPI 2012},
+ Date-Added = {2012-03-14 14:37:23 +0000},
+ Date-Modified = {2020-01-05 05:31:12 -0700},
+ Keywords = {papers, burstbuffer, simulation, hpc, distributed},
+ Month = {April 16 - 20},
+ Title = {On the Role of Burst Buffers in Leadership-class Storage Systems},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQTC9saXUtbXNzdDEyLnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5saXUtbXNzdDEyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpdS1tc3N0MTIucGRmAA4AHgAOAGwAaQB1AC0AbQBzAHMAdAAxADIALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LW1zc3QxMi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==},
+ Bdsk-Url-1 = {http://www.mcs.anl.gov/uploads/cels/papers/P2070-0312.pdf}}
+
+@article{ames:peds12,
+ Abstract = {File system metadata management has become a bottleneck for many data-intensive applications that rely on high-performance file systems. Part of the bottleneck is due to the limitations of an almost 50-year-old interface standard with metadata abstractions that were designed at a time when high-end file systems managed less than 100MB. Today's high-performance file systems store 7--9 orders of magnitude more data, resulting in a number of data items for which these metadata abstractions are inadequate, such as directory hierarchies unable to handle complex relationships among data. Users of file systems have attempted to work around these inadequacies by moving application-specific metadata management to relational databases to make metadata searchable. Splitting file system metadata management into two separate systems introduces inefficiencies and systems management problems. To address this problem, we propose QMDS: a file system metadata management service that integrates all file system metadata and uses a graph data model with attributes on nodes and edges. Our service uses a query language interface for file identification and attribute retrieval. We present our metadata management service design and architecture and study its performance using a text analysis benchmark application. Results from our QMDS prototype show the effectiveness of this approach. Compared to the use of a file system and relational database, the QMDS prototype shows superior performance for both ingest and query workloads.},
+ Author = {Sasha Ames and Maya Gokhale and Carlos Maltzahn},
+ Date-Added = {2012-02-27 18:02:43 +0000},
+ Date-Modified = {2020-01-05 05:32:03 -0700},
+ Journal = {International Journal of Parallel, Emergent and Distributed Systems},
+ Keywords = {papers, metadata, management, graphs, filesystems, datamanagement},
+ Number = {2},
+ Title = {QMDS: a file system metadata management service supporting a graph data model-based query language},
+ Volume = {27},
+ Year = {2012},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQS9hbWVzLXBlZHMxMi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYW1lcy1wZWRzMTIucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUEAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QTphbWVzLXBlZHMxMi5wZGYAAA4AIAAPAGEAbQBlAHMALQBwAGUAZABzADEAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQS9hbWVzLXBlZHMxMi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY}}
+
+@inproceedings{liu:ppam11,
+ Abstract = {Exascale supercomputers will have the potential for billion-way parallelism. While physical implementations of these systems are currently not available, HPC system designers can develop models of exascale systems to evaluate system design points. Modeling these systems and associated subsystems is a significant challenge. In this paper, we present the Co-design of Exascale Storage System (CODES) framework for evaluating exascale storage system design points. As part of our early work with CODES, we discuss the use of the CODES framework to simulate leadership-scale storage systems in a tractable amount of time using parallel discrete-event simulation. We describe the current storage system models and protocols included with the CODES framework and demonstrate the use of CODES through simulations of an existing petascale storage system.
+},
+ Address = {Torun, Poland},
+ Author = {Ning Liu and Christopher Carothers and Jason Cope and Philip Carns and Robert Ross and Adam Crume and Carlos Maltzahn},
+ Booktitle = {PPAM 2011},
+ Date-Added = {2012-01-17 01:13:05 +0000},
+ Date-Modified = {2020-01-05 05:32:41 -0700},
+ Keywords = {papers, simulation, exascale, storage, systems, parallel, filesystems, hpc},
+ Month = {September 11-14},
+ Title = {Modeling a Leadership-scale Storage System},
+ Year = {2011},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQTC9saXUtcHBhbTExLnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5saXUtcHBhbTExLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTAAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpMOmxpdS1wcGFtMTEucGRmAA4AHgAOAGwAaQB1AC0AcABwAGEAbQAxADEALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0wvbGl1LXBwYW0xMS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==}}
+
+@inproceedings{buck:sc11,
+ Abstract = {Hadoop has become the de facto platform for large-scale data analysis in commercial applications, and increasingly so in scientific applications. However, Hadoop's byte stream data model causes inefficiencies when used to process scientific data that is commonly stored in highly-structured, array-based binary file formats resulting in limited scalability of Hadoop applications in science. We introduce SciHadoop, a Hadoop plugin allowing scientists to specify logical queries over array-based data models. SciHadoop executes queries as map/reduce programs defined over the logical data model. We describe the implementation of a SciHadoop prototype for NetCDF data sets and quantify the performance of five separate optimizations that address the following goals for several representative aggregate queries: reduce total data transfers, reduce remote reads, and reduce unnecessary reads. Two optimizations allow holistic aggregate queries to be evaluated opportunistically during the map phase; two additional optimizations intelligently partition input data to increase read locality, and one optimization avoids block scans by examining the data dependencies of an executing query to prune input partitions. Experiments involving a holistic function show run-time improvements of up to 8x, with drastic reductions of IO, both locally and over the network.},
+ Address = {Seattle, WA},
+ Author = {Joe Buck and Noah Watkins and Jeff LeFevre and Kleoni Ioannidou and Carlos Maltzahn and Neoklis Polyzotis and Scott A. Brandt},
+ Booktitle = {SC '11},
+ Date-Added = {2011-08-02 22:58:10 +0000},
+ Date-Modified = {2020-01-05 05:34:48 -0700},
+ Keywords = {papers, mapreduce, datamanagement, hpc, structured, netcdf},
+ Month = {November},
+ Read = {1},
+ Title = {SciHadoop: Array-based Query Processing in Hadoop},
+ Year = {2011},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPQi9idWNrLXNjMTEucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DWJ1Y2stc2MxMS5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnVjay1zYzExLnBkZgAADgAcAA0AYgB1AGMAawAtAHMAYwAxADEALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL0IvYnVjay1zYzExLnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=}}
+
+@inproceedings{ames:nas11,
+ Address = {Dalian, China},
+ Author = {Sasha Ames and Maya B. Gokhale and Carlos Maltzahn},
+ Booktitle = {NAS 2011},
+ Date-Added = {2011-05-26 23:15:19 -0700},
+ Date-Modified = {2011-05-26 23:17:11 -0700},
+ Keywords = {papers, metadata, graphs, linking, filesystems},
+ Month = {July 28-30},
+ Title = {QMDS: A File System Metadata Management Service Supporting a Graph Data Model-based Query Language},
+ Year = {2011}}
+
+@inproceedings{pineiro:rtas11,
+ Abstract = {Real-time systems and applications are becoming increasingly complex and often comprise multiple communicating tasks. The management of the individual tasks is well-understood, but the interaction of communicating tasks with different timing characteristics is less well-understood. We discuss several representative inter-task communication flows via reserved memory buffers (possibly interconnected via a real-time network) and present RAD-Flows, a model for managing these interactions. We provide proofs and simulation results demonstrating the correctness and effectiveness of RAD-Flows, allowing system designers to determine the amount of memory required based upon the characteristics of the interacting tasks and to guarantee real-time operation of the system as a whole.},
+ Address = {Chicago, IL},
+ Author = {Roberto Pineiro and Kleoni Ioannidou and Carlos Maltzahn and Scott A. Brandt},
+ Booktitle = {RTAS 2011},
+ Date-Added = {2010-12-15 12:11:43 -0800},
+ Date-Modified = {2020-01-05 05:37:41 -0700},
+ Keywords = {papers, memory, realtime, qos, performance, management},
+ Month = {April 11-14},
+ Title = {RAD-FLOWS: Buffering for Predictable Communication},
+ Year = {2011},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUC9waW5laXJvLXJ0YXMxMS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8ScGluZWlyby1ydGFzMTEucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVAAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UDpwaW5laXJvLXJ0YXMxMS5wZGYADgAmABIAcABpAG4AZQBpAHIAbwAtAHIAdABhAHMAMQAxAC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9QL3BpbmVpcm8tcnRhczExLnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn}}
+
+@article{maltzahn:login10,
+ Abstract = {The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop.},
+ Author = {Carlos Maltzahn and Esteban Molina-Estolano and Amandeep Khurana and Alex J. Nelson and Scott A. Brandt and Sage A. Weil},
+ Date-Added = {2010-09-30 15:19:48 -0700},
+ Date-Modified = {2020-01-05 05:43:26 -0700},
+ Journal = {;login: The USENIX Magazine},
+ Keywords = {papers, filesystems, parallel, hadoop, mapreduce, storage},
+ Number = {4},
+ Title = {Ceph as a Scalable Alternative to the Hadoop Distributed File System},
+ Volume = {35},
+ Year = {2010},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAWTS9tYWx0emFobi1sb2dpbjEwLnBkZk8RAXAAAAAAAXAAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xRtYWx0emFobi1sb2dpbjEwLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABTQAAAgA8LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpNOm1hbHR6YWhuLWxvZ2luMTAucGRmAA4AKgAUAG0AYQBsAHQAegBhAGgAbgAtAGwAbwBnAGkAbgAxADAALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACcvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tbG9naW4xMC5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD0AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABsQ==}}
+
+@inproceedings{brandt:pdsw09,
+ Abstract = {File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze large-scale data, we are developing Damasc, an enhanced file system where rich data management services for scientific computing are provided as a native part of the file system.
+This paper presents our vision for Damasc, a performant file system that would allow scientists or even casual users to pose declarative queries and updates over views of underlying files that are stored in their native bytestream format. In Damasc, a configurable layer is added on top of the file system to expose the contents of files in a logical data model through which views can be defined and used for queries and updates. The logical data model and views are leveraged to optimize access to files through caching and self-organizing indexing. In addition, provenance capture and analysis of file access is also built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code.
+},
+ Address = {Portland, OR},
+ Author = {Scott A. Brandt and Carlos Maltzahn and Neoklis Polyzotis and Wang-Chiew Tan},
+ Booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ Date-Added = {2010-01-26 23:50:43 -0800},
+ Date-Modified = {2020-01-05 05:49:01 -0700},
+ Keywords = {papers, datamanagement, filesystems},
+ Month = {November 15},
+ Title = {Fusing Data Management Services with File Systems},
+ Year = {2009},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATQi9icmFuZHQtcGRzdzA5LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFicmFuZHQtcGRzdzA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQgAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpCOmJyYW5kdC1wZHN3MDkucGRmAAAOACQAEQBiAHIAYQBuAGQAdAAtAHAAZABzAHcAMAA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9CL2JyYW5kdC1wZHN3MDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==}}
+
+@inproceedings{estolano:pdsw09,
+ Abstract = {MapReduce-tailored distributed filesystems---such as HDFS for Hadoop MapReduce---and parallel high-performance computing filesystems are tailored for considerably different workloads. The purpose of our work is to examine the performance of each filesystem when both sorts of workload run on it concurrently.
+We examine two workloads on two filesystems. For the HPC workload, we use the IOR checkpointing benchmark and the Parallel Virtual File System, Version 2 (PVFS); for Hadoop, we use an HTTP attack classifier and the CloudStore filesystem. We analyze the performance of each file system when it concurrently runs its ``native'' workload as well as the non-native workload.},
+ Address = {Portland, OR},
+ Author = {Esteban Molina-Estolano and Maya Gokhale and Carlos Maltzahn and John May and John Bent and Scott Brandt},
+ Booktitle = {Proceedings of the 2009 ACM Petascale Data Storage Workshop (PDSW 09)},
+ Date-Added = {2010-01-03 23:04:09 -0800},
+ Date-Modified = {2020-01-05 05:51:32 -0700},
+ Keywords = {papers, performance, hpc, mapreduce, filesystems},
+ Month = {November 15},
+ Title = {Mixing Hadoop and HPC Workloads on Parallel Filesystems},
+ Year = {2009},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXRS1GL2VzdG9sYW5vLXBkc3cwOS5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TZXN0b2xhbm8tcGRzdzA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tcGRzdzA5LnBkZgAADgAoABMAZQBzAHQAbwBsAGEAbgBvAC0AcABkAHMAdwAwADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1wZHN3MDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==}}
+
+@inproceedings{bigelow:pdsw07,
+ Abstract = {Many applications---for example, scientific simulation, real-time data acquisition, and distributed reservation systems---have I/O performance requirements, yet most large, distributed storage systems lack the ability to guarantee I/O performance. We are working on end-to-end performance management in scalable, distributed storage systems. The kinds of storage systems we are targeting include large high-performance computing (HPC) clusters, which require both large data volumes and high I/O rates, as well as large-scale general-purpose storage systems.},
+ Address = {Reno, NV},
+ Author = {David Bigelow and Suresh Iyer and Tim Kaldewey and Roberto Pineiro and Anna Povzner and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ Booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ Date-Added = {2009-09-29 12:08:09 -0700},
+ Date-Modified = {2020-01-05 05:56:32 -0700},
+ Keywords = {papers, performance, management, distributed, storage, scalable},
+ Title = {End-to-end Performance Management for Scalable Distributed Storage},
+ Year = {2007},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUQi9iaWdlbG93LXBkc3cwNy5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SYmlnZWxvdy1wZHN3MDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpiaWdlbG93LXBkc3cwNy5wZGYADgAmABIAYgBpAGcAZQBsAG8AdwAtAHAAZABzAHcAMAA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9CL2JpZ2Vsb3ctcGRzdzA3LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn}}
+
+@inproceedings{buck:dadc09,
+ Abstract = {High-end computing is increasingly I/O bound as computations become more data-intensive, and data transport technologies struggle to keep pace with the demands of large-scale, distributed computations. One approach to avoiding unnecessary I/O is to move the processing to the data, as seen in Google's successful, but relatively specialized, MapReduce system. This paper discusses our investigation towards a general solution for enabling in-situ computation in a peta-scale storage system. We believe our work with flexible, application-specific structured storage is the key to addressing the I/O overhead caused by data partitioning across storage nodes. In order to manage competing workloads on storage nodes, our research in system performance management is leveraged. Our ultimate goal is a general framework for in-situ data-intensive processing, indexing, and searching, which we expect to provide orders of magnitude performance increases for data-intensive workloads.},
+ Address = {Munich, Germany},
+ Author = {Joe Buck and Noah Watkins and Carlos Maltzahn and Scott A. Brandt},
+ Booktitle = {2nd International Workshop on Data-Aware Distributed Computing (in conjunction with HPDC-18)},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:01:11 -0700},
+ Keywords = {papers, filesystems, programmable},
+ Month = {June 9},
+ Title = {Abstract Storage: Moving file format-specific abstractions into petabyte-scale storage systems},
+ Year = {2009},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARQi9idWNrLWRhZGMwOS5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8PYnVjay1kYWRjMDkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAUIAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6QjpidWNrLWRhZGMwOS5wZGYAAA4AIAAPAGIAdQBjAGsALQBkAGEAZABjADAAOQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvQi9idWNrLWRhZGMwOS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY}}
+
+@inproceedings{brandt:ospert08,
+ Abstract = {Real-time systems are growing in size and complexity and must often manage multiple competing tasks in environments where CPU is not the only limited shared resource. Memory, network, and other devices may also be shared and system-wide performance guarantees may require the allocation and scheduling of many diverse resources. We present our on-going work on performance management in a representative distributed real-time system---a distributed storage system with performance requirements---and discuss our integrated model for managing diverse resources to provide end-to-end performance guarantees.
+},
+ Address = {Prague, Czech Republic},
+ Author = {Scott A. Brandt and Carlos Maltzahn and Anna Povzner and Roberto Pineiro and Andrew Shewmaker and Tim Kaldewey},
+ Booktitle = {OSPERT 2008},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:01:44 -0700},
+ Keywords = {papers, storage, systems, distributed, performance, management, qos, realtime},
+ Month = {July},
+ Title = {An Integrated Model for Performance Management in a Distributed System},
+ Year = {2008},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVQi9icmFuZHQtb3NwZXJ0MDgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2JyYW5kdC1vc3BlcnQwOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFCAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOkI6YnJhbmR0LW9zcGVydDA4LnBkZgAADgAoABMAYgByAGEAbgBkAHQALQBvAHMAcABlAHIAdAAwADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0IvYnJhbmR0LW9zcGVydDA4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=}}
+
+@article{estolano:jpcs09,
+ Abstract = {Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte- and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.},
+ Author = {Esteban Molina-Estolano and Carlos Maltzahn and John Bent and Scott A. Brandt},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:02:20 -0700},
+ Journal = {J. Phys.: Conf. Ser.},
+ Keywords = {papers, performance, simulation, filesystems},
+ Number = {012050},
+ Title = {Building a Parallel File System Simulator},
+ Volume = {126},
+ Year = {2009},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXRS1GL2VzdG9sYW5vLWpwY3MwOS5wZGZPEQFyAAAAAAFyAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8TZXN0b2xhbm8tanBjczA5LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA0UtRgAAAgA9LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpFLUY6ZXN0b2xhbm8tanBjczA5LnBkZgAADgAoABMAZQBzAHQAbwBsAGEAbgBvAC0AagBwAGMAcwAwADkALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACgvTXkgRHJpdmUvUGFwZXJzL0UtRi9lc3RvbGFuby1qcGNzMDkucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkAD4AAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABtA==}}
+
+@inproceedings{weil:osdi06,
+ Abstract = {We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second.
+},
+ Address = {Seattle, WA},
+ Author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Darrell D. E. Long and Carlos Maltzahn},
+ Booktitle = {OSDI'06},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:03:57 -0700},
+ Keywords = {papers, parallel, filesystems, distributed, storage, systems, obsd, p2p},
+ Month = {November},
+ Read = {1},
+ Title = {{Ceph}: A Scalable, High-Performance Distributed File System},
+ Year = {2006},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARVy93ZWlsLW9zZGkwNi5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Pd2VpbC1vc2RpMDYucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3ZWlsLW9zZGkwNi5wZGYAAA4AIAAPAHcAZQBpAGwALQBvAHMAZABpADAANgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvVy93ZWlsLW9zZGkwNi5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY}}
+
+@inproceedings{maltzahn:chi95,
+ Abstract = {In a research community each researcher knows only a small fraction of the vast number of tools offered in the continually changing environment of local computer networks. Since the on-line or off-line documentation for these tools poorly supports people in finding the best tool for a given task, users prefer to ask colleagues. However, finding the right person to ask can be time-consuming and asking questions can reveal incompetence. In this paper we present an architecture for a community-sensitive help system which actively collects information about Unix tools by tapping into accounting information generated by the operating system and by interviewing users that are selected on the basis of collected information. The result is a help system that continually seeks to update itself, that contains information that is entirely based on the community's perspective on tools, and that consequently grows with the community and its dynamic environments.},
+ Address = {Denver, CO},
+ Author = {Carlos Maltzahn},
+ Booktitle = {CHI '95},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:06:12 -0700},
+ Keywords = {papers, cscw},
+ Month = {May},
+ Title = {Community Help: Discovering Tools and Locating Experts in a Dynamic Environment},
+ Year = {1995},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUTS9tYWx0emFobi1jaGk5NS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SbWFsdHphaG4tY2hpOTUucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi1jaGk5NS5wZGYADgAmABIAbQBhAGwAdAB6AGEAaABuAC0AYwBoAGkAOQA1AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLWNoaTk1LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn}}
+
+@inproceedings{weil:sc06,
+ Abstract = {Emerging large-scale distributed storage systems are faced with the task of distributing petabytes of data among tens or hundreds of thousands of storage devices. Such systems must evenly distribute data and workload to efficiently utilize available resources and maximize system performance, while facilitating system growth and managing hardware failures. We have developed CRUSH, a scalable pseudo-random data distribution function designed for distributed object-based storage systems that efficiently maps data objects to storage devices without relying on a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage while minimizing unnecessary data movement. The algorithm accommodates a wide variety of data replication and reliability mechanisms and distributes data in terms of user-defined policies that enforce separation of replicas across failure domains.},
+ Address = {Tampa, FL},
+ Author = {Sage A. Weil and Scott A. Brandt and Ethan L. Miller and Carlos Maltzahn},
+ Booktitle = {SC '06},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:10:11 -0700},
+ Keywords = {papers, hashing, parallel, filesystems, placement, related:ceph, obsd},
+ Month = {November},
+ Publisher = {ACM},
+ Title = {{CRUSH}: Controlled, Scalable, Decentralized Placement of Replicated Data},
+ Year = {2006},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAPVy93ZWlsLXNjMDYucGRmTxEBVAAAAAABVAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////DXdlaWwtc2MwNi5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFXAAACADUvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOlc6d2VpbC1zYzA2LnBkZgAADgAcAA0AdwBlAGkAbAAtAHMAYwAwADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACAvTXkgRHJpdmUvUGFwZXJzL1cvd2VpbC1zYzA2LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA2AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAY4=}}
+
+@article{povzner:osr08,
+ Abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ Author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:12:06 -0700},
+ Journal = {Operating Systems Review},
+ Keywords = {papers, predictable, performance, storage, media, realtime},
+ Month = {May},
+ Number = {4},
+ Pages = {13-25},
+ Title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ Volume = {42},
+ Year = {2008},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxATUC9wb3Z6bmVyLW9zcjA4LnBkZk8RAWQAAAAAAWQAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////xFwb3Z6bmVyLW9zcjA4LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABUAAAAgA5LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpQOnBvdnpuZXItb3NyMDgucGRmAAAOACQAEQBwAG8AdgB6AG4AZQByAC0AbwBzAHIAMAA4AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAkL015IERyaXZlL1BhcGVycy9QL3BvdnpuZXItb3NyMDgucGRmABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADoAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABog==}}
+
+@inproceedings{povzner:eurosys08,
+ Abstract = {Guaranteed I/O performance is needed for a variety of applications ranging from real-time data collection to desktop multimedia to large-scale scientific simulations. Reservations on throughput, the standard measure of disk performance, fail to effectively manage disk performance due to the orders of magnitude difference between best-, average-, and worst-case response times, allowing reservation of less than 0.01% of the achievable bandwidth. We show that by reserving disk resources in terms of utilization it is possible to create a disk scheduler that supports reservation of nearly 100% of the disk resources, provides arbitrarily hard or soft guarantees depending upon application needs, and yields efficiency as good or better than best-effort disk schedulers tuned for performance. We present the architecture of our scheduler, prove the correctness of its algorithms, and provide results demonstrating its effectiveness.},
+ Address = {Glasgow, Scotland},
+ Author = {Anna Povzner and Tim Kaldewey and Scott A. Brandt and Richard Golding and Theodore Wong and Carlos Maltzahn},
+ Booktitle = {Eurosys 2008},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:12:26 -0700},
+ Keywords = {papers, performance, management, storage, systems, fahrrad, rbed, realtime, qos},
+ Month = {March 31 - April 4},
+ Title = {Efficient Guaranteed Disk Request Scheduling with Fahrrad},
+ Year = {2008},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXUC9wb3Z6bmVyLWV1cm9zeXMwOC5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VcG92em5lci1ldXJvc3lzMDgucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVAAAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6UDpwb3Z6bmVyLWV1cm9zeXMwOC5wZGYAAA4ALAAVAHAAbwB2AHoAbgBlAHIALQBlAHUAcgBvAHMAeQBzADAAOAAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvUC9wb3Z6bmVyLWV1cm9zeXMwOC5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2}}
+
+@article{maltzahn:ddas07,
+ Abstract = {Managing storage in the face of relentless growth in the number and variety of files on storage systems creates demand for rich file system metadata as is made evident by the recent emergence of rich metadata support in many applications as well as file systems. Yet, little support exists for sharing metadata across file systems even though it is not uncommon for users to manage multiple file systems and to frequently share copies of files across devices and with other users. Encouraged by the surge in popularity of collaborative bookmarking sites that share the burden of creating metadata for online content [21], we present Graffiti, a distributed organization layer for collaboratively sharing rich metadata across heterogeneous file systems. The primary purpose of Graffiti is to provide a research and rapid prototyping platform for managing metadata across file systems and users.},
+ Author = {Carlos Maltzahn and Nikhil Bobb and Mark W. Storer and Damian Eads and Scott A. Brandt and Ethan L. Miller},
+ Booktitle = {Distributed Data \& Structures 7},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:13:12 -0700},
+ Editor = {Thomas Schwarz},
+ Journal = {Proceedings in Informatics},
+ Keywords = {papers, pim, tagging, distributed, naming, linking, metadata},
+ Pages = {97-111},
+ Publisher = {Carleton Scientific},
+ Read = {Yes},
+ Title = {Graffiti: A Framework for Testing Collaborative Distributed Metadata},
+ Volume = {21},
+ Year = {2007},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVTS9tYWx0emFobi1kZGFzMDcucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E21hbHR6YWhuLWRkYXMwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tZGRhczA3LnBkZgAADgAoABMAbQBhAGwAdAB6AGEAaABuAC0AZABkAGEAcwAwADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tZGRhczA3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=}}
+
+@inproceedings{rose:caise92,
+ Abstract = {Repositories provide the information system's support to layer software environments. Initially, repository technology has been dominated by object representation issues. Teams are not part of the ball game. In this paper, we propose the concept of sharing processes which supports distribution and sharing of objects and tasks by teams. Sharing processes are formally specified as classes of non-deterministic finite automata connected to each other by deduction rules. They are intended to coordinate object access and communication for task distribution in large development projects. In particular, we show how interactions between both sharings improve object management.},
+ Address = {Manchester, UK},
+ Author = {Thomas Rose and Carlos Maltzahn and Matthias Jarke},
+ Booktitle = {Advanced Information Systems Engineering (CAiSE'92)},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:14:53 -0700},
+ Editor = {Pericles Loucopoulos},
+ Keywords = {papers, sharing, cscw, datamanagement},
+ Month = {May 12--15},
+ Pages = {17--32},
+ Publisher = {Springer Berlin / Heidelberg},
+ Series = {Lecture Notes in Computer Science},
+ Title = {Integrating object and agent worlds},
+ Volume = {593},
+ Year = {1992},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUUS1SL3Jvc2UtY2Fpc2U5Mi5wZGZPEQFmAAAAAAFmAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Qcm9zZS1jYWlzZTkyLnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAA1EtUgAAAgA6LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpRLVI6cm9zZS1jYWlzZTkyLnBkZgAOACIAEAByAG8AcwBlAC0AYwBhAGkAcwBlADkAMgAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAJS9NeSBEcml2ZS9QYXBlcnMvUS1SL3Jvc2UtY2Fpc2U5Mi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADsAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABpQ==}}
+
+@inproceedings{ames:mss06,
+ Abstract = {As the number and variety of files stored and accessed by a typical user has dramatically increased, existing file system structures have begun to fail as a mechanism for managing all of the information contained in those files. Many applications---email clients, multimedia management applications, and desktop search engines are examples--- have been forced to develop their own richer metadata infrastructures. While effective, these solutions are generally non-standard, non-portable, non-sharable across applications, users or platforms, proprietary, and potentially inefficient. In the interest of providing a rich, efficient, shared file system metadata infrastructure, we have developed the Linking File System (LiFS). Taking advantage of non-volatile storage class memories, LiFS supports a wide variety of user and application metadata needs while efficiently supporting traditional file system operations.},
+ Address = {College Park, MD},
+ Author = {Sasha Ames and Nikhil Bobb and Kevin M. Greenan and Owen S. Hofmann and Mark W. Storer and Carlos Maltzahn and Ethan L. Miller and Scott A. Brandt},
+ Booktitle = {MSST '06},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:15:24 -0700},
+ Keywords = {papers, linking, systems, storage, metadata, storagemedium, related:quasar, filesystems},
+ Local-Url = {/Users/carlosmalt/Documents/Papers/ames-mss06.pdf},
+ Month = {May},
+ Organization = {IEEE},
+ Title = {{LiFS}: An Attribute-Rich File System for Storage Class Memories},
+ Year = {2006},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQQS9hbWVzLW1zczA2LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5hbWVzLW1zczA2LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQQAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpBOmFtZXMtbXNzMDYucGRmAA4AHgAOAGEAbQBlAHMALQBtAHMAcwAwADYALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNi5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==}}
+
+@inproceedings{maltzahn:wcw99,
+ Abstract = {The bandwidth usage due to HTTP traffic often varies considerably over the course of a day, requiring high network performance during peak periods while leaving network resources unused during off-peak periods. We show that using these extra network resources to prefetch web content during off-peak periods can significantly reduce peak bandwidth usage without compromising cache consistency. With large HTTP traffic variations it is therefore feasible to apply ``bandwidth smoothing'' to reduce the cost and the required capacity of a network infrastructure. In addition to reducing the peak network demand, bandwidth smoothing improves cache hit rates. We apply machine learning techniques to automatically develop prefetch strategies that have high accuracy. Our results are based on web proxy traces generated at a large corporate Internet exchange point and data collected from recent scans of popular web sites.},
+ Address = {San Diego, CA},
+ Author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald and James Martin},
+ Booktitle = {4th International Web Caching Workshop (WCW'99)},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:17:30 -0700},
+ Keywords = {papers, networking, intermediary, machinelearning, webcaching},
+ Month = {March 31 - April 2},
+ Title = {On Bandwidth Smoothing},
+ Year = {1999},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAUTS9tYWx0emFobi13Y3c5OS5wZGZPEQFoAAAAAAFoAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8SbWFsdHphaG4td2N3OTkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAOi86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi13Y3c5OS5wZGYADgAmABIAbQBhAGwAdAB6AGEAaABuAC0AdwBjAHcAOQA5AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAlL015IERyaXZlL1BhcGVycy9NL21hbHR6YWhuLXdjdzk5LnBkZgAAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOwAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGn}}
+
+@article{maltzahn:per97,
+ Abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on the request service time.},
+ Author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:18:29 -0700},
+ Journal = {ACM SIGMETRICS Performance Evaluation Review},
+ Keywords = {papers, performance, webcaching, networking, intermediary},
+ Month = {June},
+ Number = {1},
+ Pages = {13-23},
+ Title = {Performance Issues of Enterprise Level Web Proxies},
+ Volume = {25},
+ Year = {1997},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbTS9tYWx0emFobi1zaWdtZXRyaWNzOTcucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GW1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgAADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=}}
+
+@inproceedings{maltzahn:sigmetrics97,
+ Abstract = {Enterprise level web proxies relay world-wide web traffic between private networks and the Internet. They improve security, save network bandwidth, and reduce network latency. While the performance of web proxies has been analyzed based on synthetic workloads, little is known about their performance on real workloads. In this paper we present a study of two web proxies (CERN and Squid) executing real workloads on Digital's Palo Alto Gateway. We demonstrate that the simple CERN proxy architecture outperforms all but the latest version of Squid and continues to outperform cacheless configurations. For the measured load levels the Squid proxy used at least as many CPU, memory, and disk resources as CERN, in some configurations significantly more resources. At higher load levels the resource utilization requirements will cross and Squid will be the one using fewer resources. Lastly, we found that cache hit rates of around 30% had very little effect on the request service time.},
+ Address = {Seattle, WA},
+ Author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ Booktitle = {SIGMETRICS 1997},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:19:28 -0700},
+ Keywords = {papers, performance, tracing, networking, intermediary, webcaching},
+ Month = {June 15-18},
+ Pages = {13--23},
+ Read = {Yes},
+ Title = {Performance Issues of Enterprise Level Web Proxies},
+ Year = {1997},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAbTS9tYWx0emFobi1zaWdtZXRyaWNzOTcucGRmTxEBhAAAAAABhAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////GW1hbHR6YWhuLXNpZ21ldHJpY3M5Ny5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFNAAACAEEvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOk06bWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgAADgA0ABkAbQBhAGwAdAB6AGEAaABuAC0AcwBpAGcAbQBlAHQAcgBpAGMAcwA5ADcALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACwvTXkgRHJpdmUvUGFwZXJzL00vbWFsdHphaG4tc2lnbWV0cmljczk3LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJABCAAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAco=}}
+
+@inproceedings{weil:pdsw07,
+ Abstract = {Brick and object-based storage architectures have emerged as a means of improving the scalability of storage clusters. However, existing systems continue to treat storage nodes as passive devices, despite their ability to exhibit significant intelligence and autonomy. We present the design and implementation of RADOS, a reliable object storage service that can scale to many thousands of devices by leveraging the intelligence present in individual storage nodes. RADOS preserves consistent data access and strong safety semantics while allowing nodes to act semi-autonomously to self-manage replication, failure detection, and failure recovery through the use of a small cluster map. Our implementation offers excellent performance, reliability, and scalability while providing clients with the illusion of a single logical object store.},
+ Address = {Reno, NV},
+ Author = {Sage A. Weil and Andrew Leung and Scott A. Brandt and Carlos Maltzahn},
+ Booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:20:07 -0700},
+ Keywords = {papers, obsd, distributed, storage, systems, related:x10},
+ Local-Url = {/Users/carlosmalt/Documents/Papers/weil-pdsw07.pdf},
+ Month = {November},
+ Title = {RADOS: A Fast, Scalable, and Reliable Storage Service for Petabyte-scale Storage Clusters},
+ Year = {2007},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxARVy93ZWlsLXBkc3cwNy5wZGZPEQFcAAAAAAFcAAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8Pd2VpbC1wZHN3MDcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAVcAAAIANy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6Vzp3ZWlsLXBkc3cwNy5wZGYAAA4AIAAPAHcAZQBpAGwALQBwAGQAcwB3ADAANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIi9NeSBEcml2ZS9QYXBlcnMvVy93ZWlsLXBkc3cwNy5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAOAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGY}}
+
+@inproceedings{maltzahn:usenix99,
+ Abstract = {The dramatic increase of HTTP traffic on the Internet has resulted in wide-spread use of large caching proxy servers as critical Internet infrastructure components. With continued growth the demand for larger caches and higher performance proxies grows as well. The common bottleneck of large caching proxy servers is disk I/O. In this paper we evaluate ways to reduce the amount of required disk I/O. First we compare the file system interactions of two existing web proxy servers, CERN and SQUID. Then we show how design adjustments to the current SQUID cache architecture can dramatically reduce disk I/O. Our findings suggest that two strategies can significantly reduce disk I/O: (1) preserve locality of the HTTP reference stream while translating these references into cache references, and (2) use virtual memory instead of the file system for objects smaller than the system page size. The evaluated techniques reduced disk I/O by 50% to 70%.},
+ Address = {Monterey, CA},
+ Author = {Carlos Maltzahn and Kathy Richardson and Dirk Grunwald},
+ Booktitle = {USENIX ATC '99},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:20:58 -0700},
+ Keywords = {papers, networking, intermediary, storage, webcaching},
+ Month = {June 6-11},
+ Read = {Yes},
+ Title = {Reducing the Disk I/O of Web Proxy Server Caches},
+ Year = {1999},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAXTS9tYWx0emFobi11c2VuaXg5OS5wZGZPEQF0AAAAAAF0AAIAAAxHb29nbGUgRHJpdmUAAAAAAAAAAAAAAAAAAAAAAAAAQkQAAf////8VbWFsdHphaG4tdXNlbml4OTkucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/////wAAAAAAAAAAAAAAAAABAAMAAAoAY3UAAAAAAAAAAAAAAAAAAU0AAAIAPS86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6TTptYWx0emFobi11c2VuaXg5OS5wZGYAAA4ALAAVAG0AYQBsAHQAegBhAGgAbgAtAHUAcwBlAG4AaQB4ADkAOQAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAKC9NeSBEcml2ZS9QYXBlcnMvTS9tYWx0emFobi11c2VuaXg5OS5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPgAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAG2}}
+
+@inproceedings{ames:mss05,
+ Abstract = {Traditional file systems provide a weak and inadequate structure for meaningful representations of file interrelationships and other context-providing metadata. Existing designs, which store additional file-oriented metadata either in a database, on disk, or both, are limited by the technologies upon which they depend. Moreover, they do not provide for user-defined relationships among files. To address these issues, we created the Linking File System (LiFS), a file system design in which files may have both arbitrary user- or application-specified attributes, and attributed links between files. In order to assure performance when accessing links and attributes, the system is designed to store metadata in non-volatile memory. This paper discusses several use cases that take advantage of this approach and describes the user-space prototype we developed to test the concepts presented.
+},
+ Address = {Monterey, CA},
+ Author = {Alexander Ames and Nikhil Bobb and Scott A. Brandt and Adam Hiatt and Carlos Maltzahn and Ethan L. Miller and Alisa Neeman and Deepa Tuteja},
+ Booktitle = {MSST '05},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:21:32 -0700},
+ Keywords = {papers, ssrc, metadata, filesystems, linking},
+ Local-Url = {/Users/carlosmalt/Documents/Papers/ames-mss05.pdf},
+ Month = {April},
+ Title = {Richer File System Metadata Using Links and Attributes},
+ Year = {2005},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAQQS9hbWVzLW1zczA1LnBkZk8RAVgAAAAAAVgAAgAADEdvb2dsZSBEcml2ZQAAAAAAAAAAAAAAAAAAAAAAAABCRAAB/////w5hbWVzLW1zczA1LnBkZgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/////AAAAAAAAAAAAAAAAAAEAAwAACgBjdQAAAAAAAAAAAAAAAAABQQAAAgA2LzpWb2x1bWVzOkdvb2dsZURyaXZlOk15IERyaXZlOlBhcGVyczpBOmFtZXMtbXNzMDUucGRmAA4AHgAOAGEAbQBlAHMALQBtAHMAcwAwADUALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACEvTXkgRHJpdmUvUGFwZXJzL0EvYW1lcy1tc3MwNS5wZGYAABMAFC9Wb2x1bWVzL0dvb2dsZURyaXZl//8AAAAIAA0AGgAkADcAAAAAAAACAQAAAAAAAAAFAAAAAAAAAAAAAAAAAAABkw==}}
+
+@inproceedings{koren:pdsw07,
+ Abstract = {As users interact with file systems of ever increasing size, it is becoming more difficult for them to familiarize themselves with the entire contents of the file system. In petabyte-scale systems, users must navigate a pool of billions of shared files in order to find the information they are looking for. One way to help alleviate this problem is to integrate navigation and search into a common framework.
+One such method is faceted search. This method originated within the information retrieval community, and has proved popular for navigating large repositories, such as those in e-commerce sites and digital libraries. This paper introduces faceted search and outlines several current research directions in adapting faceted search techniques to petabyte-scale file systems.},
+ Address = {Reno, NV},
+ Author = {Jonathan Koren and Yi Zhang and Sasha Ames and Andrew Leung and Carlos Maltzahn and Ethan L. Miller},
+ Booktitle = {Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW 07)},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:24:17 -0700},
+ Keywords = {papers, ir, filesystems, metadata, facets, search},
+ Month = {November},
+ Title = {Searching and Navigating Petabyte Scale File Systems Based on Facets},
+ Year = {2007},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxASSy9rb3Jlbi1wZHN3MDcucGRmTxEBYAAAAAABYAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EGtvcmVuLXBkc3cwNy5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACADgvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a29yZW4tcGRzdzA3LnBkZgAOACIAEABrAG8AcgBlAG4ALQBwAGQAcwB3ADAANwAuAHAAZABmAA8AGgAMAEcAbwBvAGcAbABlACAARAByAGkAdgBlABIAIy9NeSBEcml2ZS9QYXBlcnMvSy9rb3Jlbi1wZHN3MDcucGRmAAATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA5AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAZ0=}}
+
+@article{jarke:ijicis92,
+ Abstract = {Information systems support for design environments emphasizes object management and tends to neglect the growing demand for team support. Process management is often tackled by rigid technological protocols which are likely to get in the way of group productivity and quality. Group tools must be introduced in an unobtrusive way which extends current practice yet provides structure and documentation of development experiences. The concept of sharing processes allows agents to coordinate the sharing of ideas, tasks, and results by interacting protocol automata which can be dynamically adapted to situational requirements. Inconsistency is managed with equal emphasis as consistency. The sharing process approach has been implemented in a system called ConceptTalk which has been experimentally integrated with design environments for information and hypertext systems.},
+ Author = {Matthias Jarke and Carlos Maltzahn and Thomas Rose},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:25:27 -0700},
+ Journal = {International Journal of Intelligent and Cooperative Information Systems},
+ Keywords = {papers, sharing, cscw, datamanagement},
+ Number = {1},
+ Pages = {145--167},
+ Title = {Sharing Processes: Team Coordination in Design Repositories},
+ Volume = {1},
+ Year = {1992},
+ Bdsk-Url-1 = {https://www.worldscientific.com/doi/abs/10.1142/S0218215792000076}}
+
+@inproceedings{ellis:hicss97,
+ Abstract = {Chautauqua is an exploratory workflow management system designed and implemented within the Collaboration Technology Research group (CTRG) at the University of Colorado. This system represents a tightly knit merger of workflow technology and groupware technology. Chautauqua has been in test usage at the University of Colorado since 1995. This document discusses Chautauqua - its motivation, its design, and its implementation. Our emphasis here is on its novel features, and the techniques for implementing these features.},
+ Address = {Wailea, Maui, HI},
+ Author = {Clarence E. Ellis and Carlos Maltzahn},
+ Booktitle = {30th Hawaii International Conference on System Sciences, Information System Track},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:26:44 -0700},
+ Keywords = {papers, workflow, cscw},
+ Month = {January},
+ Title = {The Chautauqua Workflow System},
+ Year = {1997},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVRS1GL2VsbGlzLWhpY3NzOTcucGRmTxEBagAAAAABagACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////EWVsbGlzLWhpY3NzOTcucGRmAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAANFLUYAAAIAOy86Vm9sdW1lczpHb29nbGVEcml2ZTpNeSBEcml2ZTpQYXBlcnM6RS1GOmVsbGlzLWhpY3NzOTcucGRmAAAOACQAEQBlAGwAbABpAHMALQBoAGkAYwBzAHMAOQA3AC4AcABkAGYADwAaAAwARwBvAG8AZwBsAGUAIABEAHIAaQB2AGUAEgAmL015IERyaXZlL1BhcGVycy9FLUYvZWxsaXMtaGljc3M5Ny5wZGYAEwAUL1ZvbHVtZXMvR29vZ2xlRHJpdmX//wAAAAgADQAaACQAPAAAAAAAAAIBAAAAAAAAAAUAAAAAAAAAAAAAAAAAAAGq}}
+
+@inproceedings{kaldewey:rtas08,
+ Abstract = {Large- and small-scale storage systems frequently serve a mixture of workloads, an increasing number of which require some form of performance guarantee. Providing guaranteed disk performance---the equivalent of a ``virtual disk''---is challenging because disk requests are non-preemptible and their execution times are stateful, partially non-deterministic, and can vary by orders of magnitude. Guaranteeing throughput, the standard measure of disk performance, requires worst-case I/O time assumptions orders of magnitude greater than average I/O times, with correspondingly low performance and poor control of the resource allocation. We show that disk time utilization---analogous to CPU utilization in CPU scheduling and the only fully provisionable aspect of disk performance---yields greater control, more efficient use of disk resources, and better isolation between request streams than bandwidth or I/O rate when used as the basis for disk reservation and scheduling.},
+ Address = {St. Louis, Missouri},
+ Annote = {Springer Journal of Real-Time Systems Award for Best Student Paper},
+ Author = {Tim Kaldewey and Anna Povzner and Theodore Wong and Richard Golding and Scott A. Brandt and Carlos Maltzahn},
+ Booktitle = {RTAS 2008},
+ Date-Added = {2009-09-29 12:06:25 -0700},
+ Date-Modified = {2020-01-05 06:27:49 -0700},
+ Keywords = {papers, performance, management, storage, systems, fahrrad, rbed, qos},
+ Month = {April},
+ Title = {Virtualizing Disk Performance},
+ Year = {2008},
+ Bdsk-File-1 = {YnBsaXN0MDDSAQIDBFxyZWxhdGl2ZVBhdGhZYWxpYXNEYXRhXxAVSy9rYWxkZXdleS1ydGFzMDgucGRmTxEBbAAAAAABbAACAAAMR29vZ2xlIERyaXZlAAAAAAAAAAAAAAAAAAAAAAAAAEJEAAH/////E2thbGRld2V5LXJ0YXMwOC5wZGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP////8AAAAAAAAAAAAAAAAAAQADAAAKAGN1AAAAAAAAAAAAAAAAAAFLAAACADsvOlZvbHVtZXM6R29vZ2xlRHJpdmU6TXkgRHJpdmU6UGFwZXJzOks6a2FsZGV3ZXktcnRhczA4LnBkZgAADgAoABMAawBhAGwAZABlAHcAZQB5AC0AcgB0AGEAcwAwADgALgBwAGQAZgAPABoADABHAG8AbwBnAGwAZQAgAEQAcgBpAHYAZQASACYvTXkgRHJpdmUvUGFwZXJzL0sva2FsZGV3ZXktcnRhczA4LnBkZgATABQvVm9sdW1lcy9Hb29nbGVEcml2Zf//AAAACAANABoAJAA8AAAAAAAAAgEAAAAAAAAABQAAAAAAAAAAAAAAAAAAAaw=}}
+
diff --git a/publish b/publish
new file mode 100755
index 00000000000..7ebed3c30c5
--- /dev/null
+++ b/publish
@@ -0,0 +1,17 @@
+#! /bin/bash
+
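+# Deployment target and local paths (SITES/SITE/CROSS are defined but unused below).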
+#REMOTE='carlosm@raindance.cse.ucsc.edu'
+#REMOTE='carlosm@soe.ucsc.edu'
+REMOTE='carlosm@riverdance.soe.ucsc.edu'
+RSH='ssh'
+RSYNC='/usr/bin/rsync -avz'
+SITES=~/Sites
+SITE=$SITES/UCSC
+CROSS=$SITES/CROSS
+ACADEMIC=$HOME/git/carlosmalt/ucsc-homepage
+
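+# Re-import the BibTeX bibliography into the site, run addpdfs.py (presumably to attach PDFs), and rebuild with Hugo.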
+(pushd $ACADEMIC; academic import --bibtex maltzahn.bib; python3 addpdfs.py; hugo; popd)
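+# Sync the freshly built site to the dev directory on the web server.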
+$RSYNC -e "$RSH" $ACADEMIC/public/* $REMOTE:.html/dev/
diff --git a/static/favicon.ico b/static/favicon.ico
new file mode 100644
index 00000000000..8c52493a6bf
Binary files /dev/null and b/static/favicon.ico differ
diff --git a/test.txt b/test.txt
new file mode 100644
index 00000000000..e69de29bb2d
diff --git a/themes/academic b/themes/academic
deleted file mode 160000
index 5abeaaac7e5..00000000000
--- a/themes/academic
+++ /dev/null
@@ -1 +0,0 @@
-Subproject commit 5abeaaac7e5d65b749566e834a7b09508a1aff3a