Compressed archive file download and extraction with native types/providers for Windows and Unix

voxpupuli/puppet-archive

Puppet Archive

Table of Contents

  1. Overview
  2. Module Description
  3. Setup
  4. Usage
  5. Reference
  6. Development

Overview

This module manages download, deployment, and cleanup of archive files.

Module Description

This module uses types and providers to download and manage compressed files, with optional lifecycle functionality such as checksum verification, extraction, and cleanup. Its benefits over existing modules such as puppet-staging:

  • Implemented via types and providers instead of exec resources.
  • Follows 302 redirects and propagates download failures.
  • Optional checksum verification of archive files.
  • Automatic dependency on the parent directory.
  • Supports Windows file extraction via 7zip or PowerShell (zip files only).
  • Can clean up archive files after extraction.

This module is compatible with camptocamp/archive and provides compatibility shims for it.

Setup

On Windows, 7zip is required to extract all archives except zip files, which are extracted with PowerShell if 7zip is not available (requires System.IO.Compression.FileSystem/Windows 2012+). Windows clients can install 7zip via include 'archive'. On POSIX systems, curl is the default provider. The default provider can be overridden by configuring resource defaults in site.pp:

Archive {
  provider => 'ruby',
}
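
For the Windows case described at the start of this section, the 7zip dependency can be pulled in conditionally on mixed fleets. This is only a minimal sketch: it assumes the structured `$facts['os']['family']` fact is available, whereas other examples in this document use the legacy `$facts['osfamily']` form.

```puppet
# Sketch: include the archive class only on Windows nodes, where it
# installs 7zip. On other platforms the class is optional.
if $facts['os']['family'] == 'windows' {
  include 'archive'
}
```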

Users of the module are responsible for installing the packages required by alternative providers and by extraction utilities such as tar, gunzip, and bunzip:

if $facts['osfamily'] != 'windows' {
  package { 'wget':
    ensure => present,
  }

  package { 'bunzip':
    ensure => present,
  }

  Archive {
    provider => 'wget',
    require  => Package['wget', 'bunzip'],
  }
}

Usage

Archive module dependencies are managed by the archive class. This is only required on Windows. By default 7zip is installed via chocolatey, but the MSI package can be installed instead:

class { 'archive':
  seven_zip_name     => '7-Zip 9.20 (x64 edition)',
  seven_zip_source   => 'C:/Windows/Temp/7z920-x64.msi',
  seven_zip_provider => 'windows',
}

To automatically load archives as part of this class you can define the archives parameter.

class { 'archive':
  archives => {
    '/tmp/jta-1.1.jar' => {
      'ensure' => 'present',
      'source' => 'http://central.maven.org/maven2/javax/transaction/jta/1.1/jta-1.1.jar',
    },
  },
}

Usage Example

Simple example that downloads from web server:

archive { '/tmp/vagrant.deb':
  ensure => present,
  source => 'https://releases.hashicorp.com/vagrant/2.2.3/vagrant_2.2.3_x86_64.deb',
  user   => 0,
  group  => 0,
}

More complex example:

include 'archive' # NOTE: optional for posix platforms

archive { '/tmp/jta-1.1.jar':
  ensure        => present,
  extract       => true,
  extract_path  => '/tmp',
  source        => 'http://central.maven.org/maven2/javax/transaction/jta/1.1/jta-1.1.jar',
  checksum      => '2ca09f0b36ca7d71b762e14ea2ff09d5eac57558',
  checksum_type => sha1,
  creates       => '/tmp/javax',
  cleanup       => true,
}

archive { '/tmp/test100k.db':
  source   => 'ftp://ftp.otenet.gr/test100k.db',
  username => 'speedtest',
  password => 'speedtest',
}

If you want to extract a .tar.gz file:

$install_path        = '/opt/wso2'
$package_name        = 'wso2esb'
$package_ensure      = '4.9.0'
$repository_url      = 'http://company.com/repository/wso2'
$archive_name        = "${package_name}-${package_ensure}.tgz"
$wso2_package_source = "${repository_url}/${archive_name}"

archive { $archive_name:
  path         => "/tmp/${archive_name}",
  source       => $wso2_package_source,
  extract      => true,
  extract_path => $install_path,
  creates      => "${install_path}/${package_name}-${package_ensure}",
  cleanup      => true,
  require      => File['wso2_appdir'],
}

Puppet URL

Since March 2017, the Archive type also supports puppet URLs. For example:

archive { '/home/myuser/help':
  source        => 'puppet:///modules/profile/help.tar.gz',
  extract       => true,
  extract_path  => $homedir,
  creates       => "${homedir}/help" #directory inside tgz
}

File permission

When extracting files as non-root user, either ensure the target directory exists with the appropriate permission (see tomcat.pp for full working example):

$dirname = 'apache-tomcat-9.0.0.M3'
$filename = "${dirname}.zip"
$install_path = "/opt/${dirname}"

file { $install_path:
  ensure => directory,
  owner  => 'tomcat',
  group  => 'tomcat',
  mode   => '0755',
}

archive { $filename:
  path          => "/tmp/${filename}",
  source        => 'http://www-eu.apache.org/dist/tomcat/tomcat-9/v9.0.0.M3/bin/apache-tomcat-9.0.0.M3.zip',
  checksum      => 'f2aaf16f5e421b97513c502c03c117fab6569076',
  checksum_type => sha1,
  extract       => true,
  extract_path  => '/opt',
  creates       => "${install_path}/bin",
  cleanup       => true,
  user          => 'tomcat',
  group         => 'tomcat',
  require       => File[$install_path],
}

or use a subscribing exec to chown the directory afterwards:

$dirname = 'apache-tomcat-9.0.0.M3'
$filename = "${dirname}.zip"
$install_path = "/opt/${dirname}"

file { '/opt/tomcat':
  ensure => 'link',
  target => $install_path
}

archive { $filename:
  path          => "/tmp/${filename}",
  source        => "http://www-eu.apache.org/dist/tomcat/tomcat-9/v9.0.0.M3/bin/apache-tomcat-9.0.0.M3.zip",
  checksum      => 'f2aaf16f5e421b97513c502c03c117fab6569076',
  checksum_type => sha1,
  extract       => true,
  extract_path  => '/opt',
  creates       => "${install_path}/bin",
  cleanup       => true,
  require       => File[$install_path],
}

exec { 'tomcat permission':
  command     => "chown tomcat:tomcat ${install_path}",
  path        => $path,
  # Run only when the archive resource changes, not on every Puppet run.
  refreshonly => true,
  subscribe   => Archive[$filename],
}

Network files

For large binary files that need to be extracted locally, instead of copying the file from the network file share, simply set the file path to be the same as the source and archive will use the network file location:

archive { '/nfs/repo/software.zip':
  source        => '/nfs/repo/software.zip',
  extract       => true,
  extract_path  => '/opt',
  checksum_type => none,   # typically unnecessary
  cleanup       => false,  # keep the file on the server
}

Extract Customization

The extract_flags or extract_command parameters can be used to override the default extraction command/flag (defaults are specified in archive.rb).

# tar stripping directories:
archive { '/var/lib/kafka/kafka_2.10-0.8.2.1.tgz':
  ensure          => present,
  extract         => true,
  extract_command => 'tar xfz %s --strip-components=1',
  extract_path    => '/opt/kafka_2.10-0.8.2.1',
  cleanup         => true,
  creates         => '/opt/kafka_2.10-0.8.2.1/config',
}

# zip freshen existing files (unzip -of %s instead of unzip -o %s):
archive { '/var/lib/example.zip':
  extract       => true,
  extract_path  => '/opt',
  extract_flags => '-of',
  cleanup       => true,
  subscribe     => ...,
}
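
extract_flags also accepts a hash, which the Reference section describes as useful when different extraction tools need different flags. The following is a minimal sketch using the hash keys quoted there; the download URL and all paths are placeholders, not part of the module.

```puppet
# Sketch: per-tool extraction flags supplied as a hash.
# The source URL and paths below are hypothetical.
archive { '/tmp/example.tar.gz':
  ensure        => present,
  source        => 'https://downloads.example.com/example.tar.gz',
  extract       => true,
  extract_path  => '/opt/example',
  # Keys name the extraction tool; values replace that tool's default flags.
  extract_flags => {
    'tar' => 'xzf',
    '7z'  => 'x -aot',
  },
  creates       => '/opt/example/bin',
  cleanup       => true,
}
```

Only the entry matching the tool chosen for the archive is expected to apply, so the same hash could be shared across platforms.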

S3 bucket

S3 support is implemented via the AWS CLI. On non-Windows systems, the archive class will install this dependency when the aws_cli_install parameter is set to true:

class { 'archive':
  aws_cli_install => true,
}

# See AWS cli guide for credential and configuration settings:
# http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
file { '/root/.aws/credentials':
  ensure => file,
  ...
}
file { '/root/.aws/config':
  ensure => file,
  ...
}

archive { '/tmp/gravatar.png':
  ensure => present,
  source => 's3://bodecoio/gravatar.png',
}

NOTE: Alternative s3 provider support can be implemented by overriding the s3_download method.

GS bucket

GSUtil support is implemented via the GSUtil Package. On non-Windows systems, the archive class will install this dependency when the gsutil_install parameter is set to true:

class { 'archive':
  gsutil_install => true,
}

# See Google Cloud SDK cli guide for credential and configuration settings:
# https://cloud.google.com/storage/docs/quickstart-gsutil

archive { '/tmp/gravatar.png':
  ensure => present,
  source => 'gs://bodecoio/gravatar.png',
}

Passing headers

Sometimes headers need to be passed to the source. This can be accomplished using the headers parameter:

archive { '/tmp/slack-desktop-4.28.184-amd64.deb':
  ensure        => present,
  extract       => true,
  extract_path  => '/tmp',
  source        => 'https://downloads.slack-edge.com/releases/linux/4.28.184/prod/x64/slack-desktop-4.28.184-amd64.deb',
  checksum      => 'e5d63dc6bd112d40c97f210af4c5f66444d4d5e8',
  checksum_type => sha1,
  headers       => ['Authorization: OAuth ABC123'],
  creates       => '/usr/local/bin/slack',
  cleanup       => true,
}

Download customizations

In some cases you may need custom flags for curl/wget/s3/gsutil which can be supplied via download_options. Since this parameter is provider specific, beware of the order of defaults:

  • s3:// files accept aws cli options

    archive { '/tmp/gravatar.png':
      ensure           => present,
      source           => 's3://bodecoio/gravatar.png',
      download_options => ['--region', 'eu-central-1'],
    }
  • puppet provider override:

    archive { '/tmp/jta-1.1.jar':
      ensure           => present,
      source           => 'http://central.maven.org/maven2/javax/transaction/jta/1.1/jta-1.1.jar',
      provider         => 'wget',
      download_options => '--continue',
    }
  • The default provider is curl on Linux and ruby on Windows (where download_options has no effect).

This option can also be applied globally to address issues on specific operating systems:

if $facts['osfamily'] != 'RedHat' {
  Archive {
    download_options => '--tlsv1',
  }
}

Migrating from puppet-staging

It is recommended to use puppet-archive instead of puppet-staging. Users wishing to migrate may find the following examples useful.

puppet-staging (without extraction)

class { 'staging':
  path  => '/tmp/staging',
}

staging::file { 'master.zip':
  source => 'https://github.com/voxpupuli/puppet-archive/archive/master.zip',
}

puppet-archive (without extraction)

archive { '/tmp/staging/master.zip':
  source => 'https://github.com/voxpupuli/puppet-archive/archive/master.zip',
}

puppet-staging (with zip file extraction)

class { 'staging':
  path  => '/tmp/staging',
}

staging::file { 'master.zip':
  source  => 'https://github.com/voxpupuli/puppet-archive/archive/master.zip',
} ->
staging::extract { 'master.zip':
  target  => '/tmp/staging/master.zip',
  creates => '/tmp/staging/puppet-archive-master',
}

puppet-archive (with zip file extraction)

archive { '/tmp/staging/master.zip':
  source       => 'https://github.com/voxpupuli/puppet-archive/archive/master.zip',
  extract      => true,
  extract_path => '/tmp/staging',
  creates      => '/tmp/staging/puppet-archive-master',
  cleanup      => false,
}

Reference

Classes

  • archive: installs the 7zip package (Windows only) and the aws cli or gsutil for s3/gs support. It also accepts an archives argument to generate archive resources.
  • archive::staging: installs package dependencies and creates a staging directory for backwards compatibility. Use the archive class instead if you do not need the staging directory.

Define Resources

  • archive::artifactory: archive wrapper for JFrog Artifactory files with checksum.
  • archive::go: archive wrapper for GO Continuous Delivery files with checksum.
  • archive::nexus: archive wrapper for Sonatype Nexus files with checksum.
  • archive::download: archive wrapper and compatibility shim for camptocamp/archive. This is considered private API, as it has to change with camptocamp/archive. For this reason it will remain undocumented and will be removed when no longer needed. We suggest not using it directly. Instead, please consider migrating to archive itself where possible.

Resources

Archive

  • ensure: whether archive file should be present/absent (default: present)
  • path: namevar, archive file fully qualified file path.
  • filename: archive file name (derived from path).
  • source: archive file source, supports http|https|ftp|file|s3|gs uri.
  • headers: array of headers to pass to the source, such as an authentication token.
  • username: username to download source file.
  • password: password to download source file.
  • allow_insecure: Ignore HTTPS certificate errors (true|false). (default: false)
  • cookie: archive file download cookie.
  • checksum_type: archive file checksum type (none|md5|sha1|sha2|sha256|sha384|sha512). (default: none)
  • checksum: archive file checksum (match checksum_type)
  • checksum_url: archive file checksum source (instead of specifying checksum)
  • checksum_verify: whether checksum will be verified (true|false). (default: true)
  • extract: whether archive will be extracted after download (true|false). (default: false)
  • extract_path: target folder path to extract archive.
  • extract_command: custom extraction command ('tar xvf example.tar.gz'). Also supports sprintf format ('tar xvf %s'), which is processed with the filename: sprintf('tar xvf %s', filename)
  • temp_dir: Specify an alternative temporary directory to use for copying files, if unset then the operating system default will be used.
  • extract_flags: custom extraction options that replace the default flags. A string such as 'xvf' for a tar file would replace the default xf flag. A hash is useful when custom flags are needed for different platforms, e.g. {'tar' => 'xzf', '7z' => 'x -aot'}.
  • user: extract command user (using this option will configure the archive file permission to 0644 so the user can read the file).
  • group: extract command group (using this option will configure the archive file permission to 0644 so the user can read the file).
  • cleanup: whether archive file will be removed after extraction (true|false). (default: true)
  • creates: if file/directory exists, will not download/extract archive. If extract and cleanup are both true, this should be set to prevent Puppet from re-downloading and re-extracting the archive every run.
  • proxy_server: specify a proxy server, with port number if needed, e.g. https://example.com:8080.
  • proxy_type: proxy server type (none|http|https|ftp)
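
Several of the parameters listed above (checksum_url, proxy_server, proxy_type, temp_dir, cookie) have no example elsewhere in this document. The following is a minimal sketch rather than an example taken from the module: every host name, path, proxy address, and cookie value is a placeholder, and it assumes the file served at checksum_url contains the SHA-256 digest of the archive.

```puppet
# Sketch: checksum fetched from a URL, download routed through a proxy.
# All URLs, paths, and the proxy address are hypothetical.
archive { '/tmp/app-1.0.0.tar.gz':
  ensure        => present,
  source        => 'https://downloads.example.com/app/app-1.0.0.tar.gz',
  # Read the expected digest from a URL instead of hard-coding `checksum`.
  checksum_url  => 'https://downloads.example.com/app/app-1.0.0.tar.gz.sha256',
  checksum_type => sha256,
  proxy_server  => 'http://proxy.example.com:8080',
  proxy_type    => http,
  # Use an alternative temporary directory instead of the OS default.
  temp_dir      => '/var/tmp',
  extract       => true,
  extract_path  => '/opt',
  creates       => '/opt/app-1.0.0',
  cleanup       => true,
}

# Sketch: send a cookie with the download request (placeholder value).
archive { '/tmp/vendor-tool.zip':
  ensure => present,
  source => 'https://vendor.example.com/downloads/vendor-tool.zip',
  cookie => 'license=accepted',
}
```

Setting creates in the first resource follows the guidance above for archives that are both extracted and cleaned up.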

Archive::Artifactory

  • path: fully qualified file path for the downloaded file, or use archive_path and only supply the filename (namevar).
  • ensure: ensure the file is present/absent.
  • url: artifactory download url filepath. NOTE: replaces server, port, url_path parameters.
  • server: artifactory server name (deprecated).
  • port: artifactory server port (deprecated).
  • url_path: artifactory file path http://{server}:{port}/artifactory/{url_path} (deprecated).
  • owner: file owner (see archive params for defaults).
  • group: file group (see archive params for defaults).
  • mode: file mode (see archive params for defaults).
  • archive_path: the parent directory of local filepath.
  • extract: whether to extract the files (true/false).
  • creates: the file created when the archive is extracted.
  • cleanup: remove archive file after file extraction (true/false).
  • headers: array of headers to pass to the source.

Archive::Artifactory Example

  • retrieve gradle without authentication

    $dirname = 'gradle-1.0-milestone-4-20110723151213+0300'
    $filename = "${dirname}-bin.zip"
    
    archive::artifactory { $filename:
      archive_path => '/tmp',
      url          => "http://repo.jfrog.org/artifactory/distributions/org/gradle/${filename}",
      extract      => true,
      extract_path => '/opt',
      creates      => "/opt/${dirname}",
      cleanup      => true,
    }
    
    file { '/opt/gradle':
      ensure => link,
      target => "/opt/${dirname}",
    }
  • retrieve gradle with api token:

    $dirname = 'gradle-1.0-milestone-4-20110723151213+0300'
    $filename = "${dirname}-bin.zip"
    
    archive::artifactory { $filename:
      archive_path => '/tmp',
      url          => "http://repo.jfrog.org/artifactory/distributions/org/gradle/${filename}",
      headers      => ['X-JFrog-Art-Api: ABC123'],
      extract      => true,
      extract_path => '/opt',
      creates      => "/opt/${dirname}",
      cleanup      => true,
    }
    
    file { '/opt/gradle':
      ensure => link,
      target => "/opt/${dirname}",
    }
  • setup resource defaults

    $artifactory_authentication = lookup('jfrog_token')
    
    Archive::Artifactory {
      headers => ["X-JFrog-Art-Api: ${artifactory_authentication}"],
    }

Archive::Nexus

Archive::Nexus Example

archive::nexus { '/tmp/jtstand-ui-0.98.jar':
  url        => 'https://oss.sonatype.org',
  gav        => 'org.codehaus.jtstand:jtstand-ui:0.98',
  repository => 'codehaus-releases',
  packaging  => 'jar',
  extract    => false,
}

Development

We highly welcome new contributions to this module, especially those that include documentation, and rspec tests ;) but will happily guide you through the process, so, yes, please submit that pull request!

Note: If you are writing a dependent module that includes specs, you will need to set the puppetversion fact in your puppet-rspec tests. You can do that by adding it to the default facts of your spec/spec_helper.rb:

RSpec.configure do |c|
  c.default_facts = { :puppetversion => Puppet.version }
end