Merge remote-tracking branch 'IQSS/develop' into IQSS/IQSS#8380-counterprocesor_version_update
qqmyers committed Apr 14, 2022
2 parents 61d8933 + 4e991dd commit 52e0e1c
Showing 66 changed files with 1,297 additions and 660 deletions.
6 changes: 3 additions & 3 deletions conf/docker-aio/1prep.sh
@@ -12,10 +12,10 @@ cd ../../
cp -r scripts conf/docker-aio/testdata/
cp doc/sphinx-guides/source/_static/util/createsequence.sql conf/docker-aio/testdata/doc/sphinx-guides/source/_static/util/

-wget -q https://downloads.apache.org/maven/maven-3/3.8.4/binaries/apache-maven-3.8.4-bin.tar.gz
-tar xfz apache-maven-3.8.4-bin.tar.gz
+wget -q https://downloads.apache.org/maven/maven-3/3.8.5/binaries/apache-maven-3.8.5-bin.tar.gz
+tar xfz apache-maven-3.8.5-bin.tar.gz
mkdir maven
-mv apache-maven-3.8.4/* maven/
+mv apache-maven-3.8.5/* maven/
echo "export JAVA_HOME=/usr/lib/jvm/jre-openjdk" > maven/maven.sh
echo "export M2_HOME=../maven" >> maven/maven.sh
echo "export MAVEN_HOME=../maven" >> maven/maven.sh
1 change: 1 addition & 0 deletions doc/release-notes/8525-ingest-optional-skip.md
@@ -0,0 +1 @@
Tabular ingest can be skipped via API.
16 changes: 16 additions & 0 deletions doc/sphinx-guides/source/_static/api/dataset-create_en.jsonld
@@ -0,0 +1,16 @@
{
"http://purl.org/dc/terms/title": "Darwin's Finches",
"http://purl.org/dc/terms/subject": "Medicine, Health and Life Sciences",
"http://schema.org/inLanguage":"en",
"http://purl.org/dc/terms/creator": {
"https://dataverse.org/schema/citation/author#Name": "Finch, Fiona",
"https://dataverse.org/schema/citation/author#Affiliation": "Birds Inc."
},
"https://dataverse.org/schema/citation/Contact": {
"https://dataverse.org/schema/citation/datasetContact#E-mail": "finch@mailinator.com",
"https://dataverse.org/schema/citation/datasetContact#Name": "Finch, Fiona"
},
"https://dataverse.org/schema/citation/Description": {
"https://dataverse.org/schema/citation/dsDescription#Text": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds."
}
}
10 changes: 5 additions & 5 deletions doc/sphinx-guides/source/admin/mail-groups.rst
@@ -1,19 +1,19 @@
Mail Domain Groups
==================

-Groups can be defined based on the domain part of users (verified) email addresses. Email addresses that match
-one or more groups configuration will add the user to them.
+Groups can be defined based on the domain part of users (verified) email addresses. Email addresses that match one or more groups configuration will add the user to them.

-Within the scientific community, in many cases users will use a institutional email address for their account in a
-Dataverse installation. This might offer a simple solution for building groups of people, as the domain part can be
-seen as a selector for group membership.
+Within the scientific community, in many cases users will use a institutional email address for their account in a Dataverse installation. This might offer a simple solution for building groups of people, as the domain part can be seen as a selector for group membership.

Some use cases: installations that like to avoid Shibboleth, enable self sign up, offer multi-tenancy or can't use
:doc:`ip-groups` plus many more.

.. hint:: Please be aware that non-verified mail addresses will exclude the user even if matching. This is to avoid
privilege escalation.

.. contents:: Contents:
:local:

Listing Mail Domain Groups
--------------------------

7 changes: 4 additions & 3 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -459,7 +459,7 @@ To create a dataset, you must supply a JSON file that contains at least the foll
- Description
- Subject
-As a starting point, you can download :download:`dataset-finch1.json <../../../../scripts/search/tests/data/dataset-finch1.json>` and modify it to meet your needs. (In addition to this minimal example, you can download :download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-create-new-all-default-fields.json>` which populates all of the metadata fields that ship with a Dataverse installation.)
+As a starting point, you can download :download:`dataset-finch1.json <../../../../scripts/search/tests/data/dataset-finch1.json>` and modify it to meet your needs. (:download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-finch1_fr.json>` is a variant of this file that includes setting the metadata language (see :ref:`:MetadataLanguages`) to French (fr). In addition to this minimal example, you can download :download:`dataset-create-new-all-default-fields.json <../../../../scripts/api/data/dataset-create-new-all-default-fields.json>` which populates all of the metadata fields that ship with a Dataverse installation.)
The curl command below assumes you have kept the name "dataset-finch1.json" and that this file is in your current working directory.
@@ -1301,6 +1301,7 @@ When adding a file to a dataset, you can optionally specify the following:
- A description of the file.
- The "File Path" of the file, indicating which folder the file should be uploaded to within the dataset.
- Whether or not the file is restricted.
+- Whether or not the file skips :doc:`tabular ingest </user/tabulardataingest/index>`. If the ``tabIngest`` parameter is not specified, it defaults to ``true``.
Note that when a Dataverse instance is configured to use S3 storage with direct upload enabled, there is API support to send a file directly to S3. This is more complex and is described in the :doc:`/developers/s3-direct-upload-api` guide.
@@ -1315,13 +1316,13 @@ In the curl example below, all of the above are specified but they are optional.
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_ID=doi:10.5072/FK2/J8SJZB
-curl -H X-Dataverse-key:$API_TOKEN -X POST -F "file=@$FILENAME" -F 'jsonData={"description":"My description.","directoryLabel":"data/subdir1","categories":["Data"], "restrict":"false"}' "$SERVER_URL/api/datasets/:persistentId/add?persistentId=$PERSISTENT_ID"
+curl -H X-Dataverse-key:$API_TOKEN -X POST -F "file=@$FILENAME" -F 'jsonData={"description":"My description.","directoryLabel":"data/subdir1","categories":["Data"], "restrict":"false", "tabIngest":"false"}' "$SERVER_URL/api/datasets/:persistentId/add?persistentId=$PERSISTENT_ID"
The fully expanded example above (without environment variables) looks like this:
.. code-block:: bash
-curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X POST -F file=@data.tsv -F 'jsonData={"description":"My description.","directoryLabel":"data/subdir1","categories":["Data"], "restrict":"false"}' "https://demo.dataverse.org/api/datasets/:persistentId/add?persistentId=doi:10.5072/FK2/J8SJZB"
+curl -H X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -X POST -F file=@data.tsv -F 'jsonData={"description":"My description.","directoryLabel":"data/subdir1","categories":["Data"], "restrict":"false", "tabIngest":"false"}' "https://demo.dataverse.org/api/datasets/:persistentId/add?persistentId=doi:10.5072/FK2/J8SJZB"
You should expect a 201 ("CREATED") response and JSON indicating the database id that has been assigned to your newly uploaded file.
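The documentation change above says ``tabIngest`` defaults to ``true`` when omitted, and the curl examples pass ``"tabIngest":"false"`` to skip ingest. A minimal sketch of that defaulting rule follows. The class and method names are hypothetical and are not taken from the Dataverse code base, and the flat string map is only a stand-in for the parsed ``jsonData`` object.

```java
import java.util.Map;

// Hypothetical illustration of the documented default; not Dataverse code.
public class SketchTabIngestFlag {

    // Returns true when tabular ingest should run for the uploaded file.
    // An absent "tabIngest" key falls back to "true", as the guide states.
    static boolean shouldIngest(Map<String, String> jsonData) {
        return Boolean.parseBoolean(jsonData.getOrDefault("tabIngest", "true"));
    }

    public static void main(String[] args) {
        System.out.println(shouldIngest(Map.of()));                      // key omitted
        System.out.println(shouldIngest(Map.of("tabIngest", "false")));  // ingest skipped
    }
}
```

Read this way, ``"tabIngest":"false"`` is the value that skips ingest, and leaving the key out keeps the previous behaviour.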
@@ -99,5 +99,5 @@ With curl, this is done by adding the following header:
curl -H X-Dataverse-key:$API_TOKEN -H 'Content-Type: application/ld+json' -X POST $SERVER_URL/api/dataverses/$DATAVERSE_ID/datasets --upload-file dataset-create.jsonld
-An example jsonld file is available at :download:`dataset-create.jsonld <../_static/api/dataset-create.jsonld>`
+An example jsonld file is available at :download:`dataset-create.jsonld <../_static/api/dataset-create.jsonld>` (:download:`dataset-create_en.jsonld <../_static/api/dataset-create.jsonld>` is a version that sets the metadata language (see :ref:`:MetadataLanguages`) to English (en).)

3 changes: 2 additions & 1 deletion doc/sphinx-guides/source/installation/config.rst
@@ -730,7 +730,8 @@ Allowing the Language Used for Dataset Metadata to be Specified
Since dataset metadata can only be entered in one language, and administrators may wish to limit which languages metadata can be entered in, Dataverse also offers a separate setting defining allowed metadata languages.
The presence of the :ref:`:MetadataLanguages` database setting identifies the available options (which can be different from those in the :Languages setting above, with fewer or more options).

-Dataverse collection admins can select from these options to indicate which language should be used for new Datasets created with that specific collection. If they do not, users will be asked when creating a dataset to select the language they want to use when entering metadata.
+Dataverse collection admins can select from these options to indicate which language should be used for new Datasets created with that specific collection. If they do not, users will be asked when creating a dataset to select the language they want to use when entering metadata.
+Similarly, when this setting is defined, Datasets created/imported/migrated are required to specify a metadataLanguage compatible with the collection's requirement.

When creating or editing a dataset, users will be asked to enter the metadata in that language. The metadata language selected will also be shown when dataset metadata is viewed and will be included in metadata exports (as appropriate for each format) for published datasets:

26 changes: 20 additions & 6 deletions pom.xml
@@ -23,6 +23,8 @@
<flyway.version>5.2.4</flyway.version>
<jhove.version>1.20.1</jhove.version>
<jacoco.version>0.8.7</jacoco.version>
+<poi.version>5.2.1</poi.version>
+<tika.version>2.3.0</tika.version>
</properties>

<!-- Versions of dependencies used both directly and transitive are managed here.
@@ -200,11 +202,18 @@
<artifactId>omnifaces</artifactId>
<version>3.8</version> <!-- Or 1.8-SNAPSHOT -->
</dependency>

<dependency>
<groupId>jakarta.validation</groupId>
<artifactId>jakarta.validation-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.hibernate.validator</groupId>
<artifactId>hibernate-validator</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<dependency>
<groupId>org.glassfish</groupId>
<artifactId>jakarta.el</artifactId>
<scope>provided</scope>
@@ -286,17 +295,17 @@
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
-<version>4.1.1</version>
+<version>${poi.version}</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
-<version>4.1.1</version>
+<version>${poi.version}</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-scratchpad</artifactId>
-<version>4.1.1</version>
+<version>${poi.version}</version>
</dependency>
<dependency>
<groupId>org.openpreservation.jhove</groupId>
@@ -488,8 +497,13 @@
<!-- Full text indexing -->
<dependency>
<groupId>org.apache.tika</groupId>
-<artifactId>tika-parsers</artifactId>
-<version>1.27</version>
+<artifactId>tika-core</artifactId>
+<version>${tika.version}</version>
+</dependency>
+<dependency>
+<groupId>org.apache.tika</groupId>
+<artifactId>tika-parsers-standard-package</artifactId>
+<version>${tika.version}</version>
</dependency>
<!-- Named Entity Recognition -->
<dependency>
78 changes: 78 additions & 0 deletions scripts/api/data/dataset-finch1_fr.json
@@ -0,0 +1,78 @@
{
"metadataLanguage": "fr",
"datasetVersion": {
"metadataBlocks": {
"citation": {
"fields": [
{
"value": "Darwin's Finches",
"typeClass": "primitive",
"multiple": false,
"typeName": "title"
},
{
"value": [
{
"authorName": {
"value": "Finch, Fiona",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorName"
},
"authorAffiliation": {
"value": "Birds Inc.",
"typeClass": "primitive",
"multiple": false,
"typeName": "authorAffiliation"
}
}
],
"typeClass": "compound",
"multiple": true,
"typeName": "author"
},
{
"value": [
{ "datasetContactEmail" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactEmail",
"value" : "finch@mailinator.com"
},
"datasetContactName" : {
"typeClass": "primitive",
"multiple": false,
"typeName": "datasetContactName",
"value": "Finch, Fiona"
}
}],
"typeClass": "compound",
"multiple": true,
"typeName": "datasetContact"
},
{
"value": [ {
"dsDescriptionValue":{
"value": "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
"multiple":false,
"typeClass": "primitive",
"typeName": "dsDescriptionValue"
}}],
"typeClass": "compound",
"multiple": true,
"typeName": "dsDescription"
},
{
"value": [
"Medicine, Health and Life Sciences"
],
"typeClass": "controlledVocabulary",
"multiple": true,
"typeName": "subject"
}
],
"displayName": "Citation Metadata"
}
}
}
}
5 changes: 4 additions & 1 deletion src/main/java/ValidationMessages.properties
@@ -2,7 +2,7 @@ user.firstName=Please enter your first name.
user.lastName=Please enter your last name.
user.invalidEmail=Please enter a valid email address.
user.enterUsername=Please enter a username.
-user.usernameLength=Username must be between 2 and 60 characters.
+user.usernameLength=Username must be between {min} and {max} characters.
user.illegalCharacters=Found an illegal character(s). Valid characters are a-Z, 0-9, '_', '-', and '.'.

user.enterNickname=Please enter a nickname.
@@ -42,3 +42,6 @@ password.validate=Password reset page default email message.
guestbook.name=Enter a name for the guestbook
guestbook.response.nameLength=Please limit response to 255 characters
email.invalid=is not a valid email address.
url.invalid=is not a valid URL.
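The ``{min}`` and ``{max}`` placeholders introduced in ``user.usernameLength`` follow the Bean Validation convention in which message parameters are filled from the attributes of the constraint that failed. The sketch below imitates that substitution with plain string handling, only to show the idea. ``SketchInterpolation`` is a hypothetical name, and the real work is done by the validation provider's message interpolator rather than by code like this. The values 2 and 60 are taken from the message text this commit replaces.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of {name} placeholder substitution; not Dataverse code.
public class SketchInterpolation {

    // Matches simple placeholders such as {min} or {max}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {name} with the matching attribute value.
    // Unknown placeholders are left in the message unchanged.
    static String interpolate(String template, Map<String, ?> attributes) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Object value = attributes.get(m.group(1));
            String replacement = (value != null) ? value.toString() : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Username must be between {min} and {max} characters.";
        System.out.println(interpolate(template, Map.of("min", 2, "max", 60)));
    }
}
```

Keeping the limits out of the message text means the wording no longer has to be edited when the constraint's bounds change.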
@@ -15,6 +15,9 @@
import java.util.regex.Pattern;
import javax.validation.ConstraintValidator;
import javax.validation.ConstraintValidatorContext;

+import edu.harvard.iq.dataverse.validation.EMailValidator;
+import edu.harvard.iq.dataverse.validation.URLValidator;
import org.apache.commons.lang3.StringUtils;
import org.apache.commons.validator.routines.UrlValidator;

@@ -165,29 +168,19 @@ public boolean isValid(DatasetFieldValue value, ConstraintValidatorContext conte
// Note, length validation for FieldType.TEXT was removed to accommodate migrated data that is greater than 255 chars.

if (fieldType.equals(FieldType.URL) && !lengthOnly) {

-String[] schemes = {"http","https", "ftp"};
-UrlValidator urlValidator = new UrlValidator(schemes);
-
-try {
-if (urlValidator.isValid(value.getValue())) {
-} else {
-context.buildConstraintViolationWithTemplate(dsfType.getDisplayName() + " " + value.getValue() + " is not a valid URL.").addConstraintViolation();
-return false;
-}
-} catch (NullPointerException npe) {
+boolean isValidUrl = URLValidator.isURLValid(value.getValue());
+if (!isValidUrl) {
+context.buildConstraintViolationWithTemplate(dsfType.getDisplayName() + " " + value.getValue() + " {url.invalid}").addConstraintViolation();
return false;
}
-
}

if (fieldType.equals(FieldType.EMAIL) && !lengthOnly) {
-if(value.getDatasetField().isRequired() && value.getValue()==null){
+boolean isValidMail = EMailValidator.isEmailValid(value.getValue());
+if (!isValidMail) {
+context.buildConstraintViolationWithTemplate(dsfType.getDisplayName() + " " + value.getValue() + " {email.invalid}").addConstraintViolation();
return false;
}
-
-return EMailValidator.isEmailValid(value.getValue(), context);

}

return true;
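The hunk above moves the URL check behind ``URLValidator.isURLValid``, whose implementation is not part of this excerpt. As a rough, standard-library-only sketch of the kind of check the removed code performed (a scheme allow-list of ``http``, ``https`` and ``ftp``), one could write something like the following. ``SketchUrlCheck`` is a hypothetical name. Treating ``null`` as invalid mirrors the removed ``catch (NullPointerException npe)`` branch and is an assumption about the intended behaviour, not a statement about the new validator. ``java.net.URI`` is also more permissive than Apache Commons Validator, so this is not a drop-in equivalent.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of a scheme allow-list check; not Dataverse code.
public class SketchUrlCheck {

    // Same schemes the removed code passed to UrlValidator.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https", "ftp");

    static boolean isUrlValid(String candidate) {
        if (candidate == null) {
            return false; // assumption: mirrors the removed NPE branch
        }
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            // Require an allowed scheme and a host component.
            return scheme != null
                    && ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }

    public static void main(String[] args) {
        System.out.println(isUrlValid("https://demo.dataverse.org"));
        System.out.println(isUrlValid("mailto:finch@mailinator.com"));
    }
}
```

A static helper of this shape can be shared by the field validator and by pages such as ``DatasetPage``, which is one plausible reason for the refactoring seen in this commit.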
3 changes: 2 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -56,6 +56,7 @@

import edu.harvard.iq.dataverse.util.StringUtil;
import edu.harvard.iq.dataverse.util.SystemConfig;
+import edu.harvard.iq.dataverse.validation.URLValidator;
import edu.harvard.iq.dataverse.workflows.WorkflowComment;

import java.io.File;
@@ -3591,7 +3592,7 @@ public String save() {
// have been created in the dataset.
dataset = datasetService.find(dataset.getId());

-List<DataFile> filesAdded = ingestService.saveAndAddFilesToDataset(dataset.getEditVersion(), newFiles, null);
+List<DataFile> filesAdded = ingestService.saveAndAddFilesToDataset(dataset.getEditVersion(), newFiles, null, true);
newFiles.clear();

// and another update command:
@@ -550,12 +550,12 @@ private boolean compareVarGroup(FileMetadata fmdo, FileMetadata fmdn) {
}
}

-private boolean compareFileMetadatas(FileMetadata fmdo, FileMetadata fmdn) {
+public static boolean compareFileMetadatas(FileMetadata fmdo, FileMetadata fmdn) {

-if (!StringUtils.equals(fmdo.getDescription(), fmdn.getDescription())) {
+if (!StringUtils.equals(StringUtil.nullToEmpty(fmdo.getDescription()), StringUtil.nullToEmpty(fmdn.getDescription()))) {
return false;
}

if (!StringUtils.equals(fmdo.getCategoriesByName().toString(), fmdn.getCategoriesByName().toString())) {
return false;
}
3 changes: 2 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DataverseContact.java
@@ -16,7 +16,8 @@
import javax.persistence.JoinColumn;
import javax.persistence.ManyToOne;
import javax.persistence.Table;
-import org.hibernate.validator.constraints.Email;
+
+import edu.harvard.iq.dataverse.validation.ValidateEmail;
import org.hibernate.validator.constraints.NotBlank;

/**