Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SCIM cleanup 2 #1172

Merged
merged 23 commits into from
Jul 24, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 7 additions & 41 deletions docs/developer/scim/storage.md
Original file line number Diff line number Diff line change
@@ -1,51 +1,17 @@
# Storing SCIM-related data {#DevScimStorage}

_Author: Artyom Kazak_
_Author: Artyom Kazak, Matthias Fischmann_

---

## Storing user data {#DevScimStorageUsers}

SCIM user data is stored as JSON blobs in the `scim_user` table in Spar, one blob per SCIM-managed user. Those blobs conform to the SCIM standard and are returned by `GET /scim/v2/Users`.

Note that when a user is created via SCIM, the received blob is not written verbatim to the database – it is first parsed by the [hscim](https://github.com/wireapp/hscim) library, and all unknown fields are removed.

Sample blob:

```json
{
"schemas": [
"urn:ietf:params:scim:schemas:core:2.0:User",
"urn:wire:scim:schemas:profile:1.0"
],
"id": "ef4bafda-5be8-46e3-bed2-5bcce55cff01",
"externalId": "lana@example.com",
"userName": "lana_d",
"displayName": "Lana Donohue",
"urn:wire:scim:schemas:profile:1.0": {
"richInfo": {
"version": 0,
"fields": [
{ "type": "Title", "value": "Chief Backup Officer" },
{ "type": "Favorite quote", "value": "Monads are just giant burritos" }
]
}
},
"meta": {
"resourceType": "User",
"location": "https://staging-nginz-https.zinfra.io/scim/v2/Users/ef4bafda-5be8-46e3-bed2-5bcce55cff01",
"created": "2019-04-21T04:15:12.535509602Z",
"lastModified": "2019-04-21T04:15:18.185055531Z",
"version": "W/\"e051bc17f7e07dec815f4b9314f76f88e2949a62b6aad8c816086cff85de4783\""
}
}
```

### One-way sync from Spar to Brig {#DevScimOneWaySync}

A user is considered SCIM-managed if they were provisioned with SCIM (when it's the case, `userManagedBy` will be set to `ManagedByScim`). Data about SCIM-managed users is stored both in Brig and Spar, and should always be in sync.

Currently (2019-04-29) we only implement one-way sync – when a user is modified via SCIM, Spar takes care to update data in Brig. However, user data is _not_ updated on the Spar side when it is changed in Brig, and Brig does not yet prohibit changing user data via its API – it relies on clients to be well-behaved and respect `userManagedBy`.
SCIM user data is validated by the spar service and stored as brig users. All fields that wire doesn't care about are silently dropped. `GET /scim/v2/Users` will trigger a lookup in brig, and the data thus obtained is synthesized back into a SCIM record.

Time stamps `created_at` and `last_updated_at` for the SCIM metadata are stored in `spar.scim_user_times`. The are kept in sync with the users that are otherwise stored in brig. (Rationale: we briefly considered using `select writetime(*) from brig.user` for last update and `select writetime(activated) from brig.user` for creation, but this has two drawbacks: (a) it stores data further away from spar where it is needed, and there is a bigger risk of that data going to places where it's not supposed to be; and (b) we don't have the time stamps when storing the record, so the `POST` handler would need to do a database write and a consecutive lookup, or an `insert if not exists`.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point (a) doesnt make sense given the timestamp is stored by cassandra; not brig

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still further away from spar: if ti's in spar-cassandra, it goes from cassandra to spar; if it's in brig cassandra, it goes from cassandra to brig to spar, and brig holds data that it has no business holding.

how can i make this point more clear/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets just remove (a)


Users created by SCIM set the `ManagedBy` field in brig to `ManagedByScim`. This *should* lead to brig disallowing certain update operations (since the single source of truth should be the SCIM peer that has created and is updating the user), but we never got around to implementing that (as of Wed 15 Jul 2020 10:59:11 AM CEST). See also {@SparBrainDump} (grep for `ManagedBy`).
fisx marked this conversation as resolved.
Show resolved Hide resolved


## Storing SCIM tokens {#DevScimStorageTokens}

Expand Down
9 changes: 7 additions & 2 deletions docs/reference/spar-braindump.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
# Spar braindump {#SparBrainDump}

_Author: Matthias Fischmann_

---

# the spar service for user provisioning (scim) and authentication (saml) - a brain dump

this is a mix of information on inmplementation details, architecture,
Expand Down Expand Up @@ -252,8 +258,7 @@ If you can't find what you're looking for there, please add at least a
pending test case explaining what's missing.

Side note: Users in brig carry an enum type
[`ManagedBy`](https://github.com/wireapp/wire-server/blob/010ca7e460d13160b465de24dd3982a397f94c16/libs/brig-types/src/Brig/Types/Common.hs#L393-L413);
see also {#DevScimOneWaySync}. This is a half-implemented feature for
[`ManagedBy`](https://github.com/wireapp/wire-server/blob/010ca7e460d13160b465de24dd3982a397f94c16/libs/brig-types/src/Brig/Types/Common.hs#L393-L413). This is a half-implemented feature for
managing conflicts between changes via scim vs. changes from wire
clients; and does currently not affect deletability of users.

Expand Down
13 changes: 11 additions & 2 deletions libs/hscim/src/Web/Scim/Test/Acceptance.hs
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ module Web.Scim.Test.Acceptance
)
where

import Control.Concurrent (threadDelay)
import Control.Monad.IO.Class (liftIO)
import qualified Data.Aeson as Aeson
import qualified Data.ByteString as BS
Expand All @@ -41,6 +42,7 @@ import Test.Hspec.Wai (matchStatus)
import Test.Hspec.Wai.Internal (runWaiSession)
import Web.Scim.Class.User
import Web.Scim.Schema.Common as Hscim
import qualified Web.Scim.Schema.ListResponse as ListResponse
import Web.Scim.Schema.Meta
import Web.Scim.Schema.UserTypes
import Web.Scim.Test.Util
Expand Down Expand Up @@ -100,8 +102,12 @@ microsoftAzure AcceptanceConfig {..} = do
testuid =
either (error . show . (,resp)) (cs . Servant.toUrlPiece . Hscim.id . thing) $
Aeson.eitherDecode' @(StoredUser tag) (simpleBody resp)
-- Get user with query
get' queryConfig "/Users" `shouldRespondWith` 400
-- Get user by query
get' queryConfig (cs $ "/Users?filter userName eq " <> show userName1) `shouldRespondWith` 200
get' queryConfig (cs $ "/Users?filter=userName eq " <> show userName1) >>= \rsp -> liftIO $ do
simpleStatus rsp `shouldBe` status200
ListResponse.totalResults <$> Aeson.eitherDecode' @(ListResponse.ListResponse (StoredUser tag)) (simpleBody rsp) `shouldBe` Right 1
-- Get user by query, zero results
get' queryConfig (cs $ "/Users?filter=userName eq " <> show userName2)
`shouldRespondWith` [scim|
Expand All @@ -116,7 +122,10 @@ microsoftAzure AcceptanceConfig {..} = do
{ matchStatus = 200
}
-- Get user by externalId works
get' queryConfig "/Users?filter externalId eq \"0a21f0f2-8d2a-4f8e-479e-a20b-2d77186b5dd1\"" `shouldRespondWith` 200
liftIO $ threadDelay 1000000
get' queryConfig "/Users?filter=externalId eq \"0a21f0f2-8d2a-4f8e-479e-a20b-2d77186b5dd1\"" >>= \rsp -> liftIO $ do
simpleStatus rsp `shouldBe` status200
ListResponse.totalResults <$> Aeson.eitherDecode' @(ListResponse.ListResponse (StoredUser tag)) (simpleBody rsp) `shouldBe` Right 1
-- Update user [Multi-valued properties]
ignore $
patch'
Expand Down
2 changes: 1 addition & 1 deletion libs/wire-api/src/Wire/API/User/Profile.hs
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,7 @@ parseCountry = hush . parseOnly countryParser
-- ManagedBy

-- | Who controls changes to the user profile (where the profile is defined as "all
-- user-editable, user-visible attributes"). See {#DevScimOneWaySync}.
-- user-editable, user-visible attributes"). See {#SparBrainDump}.
data ManagedBy
= -- | The profile can be changed in-app; user doesn't show up via SCIM at all.
ManagedByWire
Expand Down
2 changes: 1 addition & 1 deletion services/brig/src/Brig/API/User.hs
Original file line number Diff line number Diff line change
Expand Up @@ -337,7 +337,7 @@ checkRestrictedUserCreation new = do
-- Update Profile

-- FUTUREWORK: this and other functions should refuse to modify a ManagedByScim user. See
-- {#DevScimOneWaySync}
-- {#SparBrainDump}

updateUser :: UserId -> ConnId -> UserUpdate -> AppIO ()
updateUser uid conn uu = do
Expand Down
4 changes: 3 additions & 1 deletion services/spar/schema/src/Main.hs
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ import qualified V5
import qualified V6
import qualified V7
import qualified V8
import qualified V9

main :: IO ()
main = do
Expand All @@ -49,7 +50,8 @@ main = do
V5.migration,
V6.migration,
V7.migration,
V8.migration
V8.migration,
V9.migration
-- When adding migrations here, don't forget to update
-- 'schemaVersion' in Spar.Data

Expand Down
38 changes: 38 additions & 0 deletions services/spar/schema/src/V9.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
-- This file is part of the Wire Server implementation.
--
-- Copyright (C) 2020 Wire Swiss GmbH <opensource@wire.com>
--
-- This program is free software: you can redistribute it and/or modify it under
-- the terms of the GNU Affero General Public License as published by the Free
-- Software Foundation, either version 3 of the License, or (at your option) any
-- later version.
--
-- This program is distributed in the hope that it will be useful, but WITHOUT
-- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
-- FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
-- details.
--
-- You should have received a copy of the GNU Affero General Public License along
-- with this program. If not, see <https://www.gnu.org/licenses/>.

module V9
( migration,
)
where

import Cassandra.Schema
import Imports
import Text.RawString.QQ

migration :: Migration
migration = Migration 9 "As replacement for `scim_user`, add smaller table for storing time stamps only" $ do
void $
schema'
[r|
CREATE TABLE if not exists scim_user_times
( uid uuid
, created_at timestamp
, last_updated_at timestamp
, primary key (uid)
) with compaction = {'class': 'LeveledCompactionStrategy'};
|]
arianvp marked this conversation as resolved.
Show resolved Hide resolved
5 changes: 3 additions & 2 deletions services/spar/spar.cabal
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
cabal-version: 1.12

-- This file has been generated from package.yaml by hpack version 0.31.2.
-- This file has been generated from package.yaml by hpack version 0.33.0.
--
-- see: https://github.com/sol/hpack
--
-- hash: 84ff7e7ee95f7419098b8029990a04ec98e64a1ed2ea41c534bea60d87be1a33
-- hash: 032057d81f563685925f83bd11795dfde9ab38ec5b711c1a2aa5319a365d45a5

name: spar
version: 0.1
Expand Down Expand Up @@ -312,6 +312,7 @@ executable spar-schema
V6
V7
V8
V9
Paths_spar
hs-source-dirs:
schema/src
Expand Down
78 changes: 29 additions & 49 deletions services/spar/src/Spar/Data.hs
Original file line number Diff line number Diff line change
Expand Up @@ -75,18 +75,18 @@ module Spar.Data
deleteScimToken,
deleteTeamScimTokens,

-- * SCIM user records
insertScimUser,
getScimUser,
getScimUsers,
deleteScimUser,
-- * SCIM user timestampes
writeScimUserTimes,
readScimUserTimes,
deleteScimUserTimes,
)
where

import Cassandra as Cas
import Control.Lens
import Control.Monad.Except
import Data.Id
import Data.Json.Util (UTCTimeMillis, toUTCTimeMillis)
import qualified Data.List.NonEmpty as NL
import Data.Misc ((<$$>))
import Data.String.Conversions
Expand All @@ -96,16 +96,16 @@ import GHC.TypeLits (KnownSymbol)
import Imports
import qualified SAML2.WebSSO as SAML
import Spar.Data.Instances (VerdictFormatCon, VerdictFormatRow, fromVerdictFormat, toVerdictFormat)
import Spar.Scim.Types
import Spar.Types
import Text.RawString.QQ
import URI.ByteString
import qualified Web.Cookie as Cky
import qualified Web.Scim.Class.User as ScimC.User
import Web.Scim.Schema.Common (WithId (..))
import Web.Scim.Schema.Meta (Meta (..), WithMeta (..))

-- | A lower bound: @schemaVersion <= whatWeFoundOnCassandra@, not @==@.
schemaVersion :: Int32
schemaVersion = 8
schemaVersion = 9

----------------------------------------------------------------------
-- helpers
Expand Down Expand Up @@ -701,55 +701,35 @@ deleteTeamScimTokens team = do
--
-- docs/developer/scim/storage.md {#DevScimStorageUsers}

-- | Store the scim user in its entirety and return the 'Scim.StoredUser'.
--
-- NB: we can add optional columns in the future and extract parts of the json blob should the need
-- arise. For instance, if we want to support different versions of SCIM, we could extract
-- @schemas@ and, throw an exception if the list of values is not supported, and store it
-- in a separate column otherwise, allowing for fast version filtering on the database.
insertScimUser ::
(HasCallStack, MonadClient m) =>
UserId ->
ScimC.User.StoredUser SparTag ->
m ()
insertScimUser uid usr =
-- | Store creation and last-update time from the scim metadata under a user id.
writeScimUserTimes :: (HasCallStack, MonadClient m) => WithMeta (WithId UserId a) -> m ()
writeScimUserTimes (WithMeta meta (WithId uid _)) =
retry x5 . write ins $
params Quorum (uid, WrappedScimStoredUser usr)
params
Quorum
( uid,
toUTCTimeMillis $ created meta,
toUTCTimeMillis $ lastModified meta
)
where
ins :: PrepQuery W (UserId, WrappedScimStoredUser SparTag) ()
ins = "INSERT INTO scim_user (id, json) VALUES (?, ?)"
ins :: PrepQuery W (UserId, UTCTimeMillis, UTCTimeMillis) ()
ins = "INSERT INTO scim_user_times (uid, created_at, last_updated_at) VALUES (?, ?, ?)"

getScimUser ::
(HasCallStack, MonadClient m) =>
UserId ->
m (Maybe (ScimC.User.StoredUser SparTag))
getScimUser uid =
fromWrappedScimStoredUser . runIdentity
<$$> (retry x1 . query1 sel $ params Quorum (Identity uid))
where
sel :: PrepQuery R (Identity UserId) (Identity (WrappedScimStoredUser SparTag))
sel = "SELECT json FROM scim_user WHERE id = ?"

-- | Return all users that can be found under a given list of 'UserId's. If some cannot be found,
-- the output list will just be shorter (no errors).
getScimUsers ::
(HasCallStack, MonadClient m) =>
[UserId] ->
m [ScimC.User.StoredUser SparTag]
getScimUsers uids =
fromWrappedScimStoredUser . runIdentity
<$$> retry x1 (query sel (params Quorum (Identity uids)))
-- | Read creation and last-update time from database for a given user id.
readScimUserTimes :: (HasCallStack, MonadClient m) => UserId -> m (Maybe (UTCTimeMillis, UTCTimeMillis))
readScimUserTimes uid = do
retry x1 . query1 sel $ params Quorum (Identity uid)
where
sel :: PrepQuery R (Identity [UserId]) (Identity (WrappedScimStoredUser SparTag))
sel = "SELECT json FROM scim_user WHERE id in ?"
sel :: PrepQuery R (Identity UserId) (UTCTimeMillis, UTCTimeMillis)
sel = "SELECT created_at, last_updated_at FROM scim_user_times WHERE uid = ?"

-- | Delete a SCIM user by id.
-- | Delete a SCIM user's access times by id.
-- You'll also want to ensure they are deleted in Brig and in the SAML Users table.
deleteScimUser ::
deleteScimUserTimes ::
(HasCallStack, MonadClient m) =>
UserId ->
m ()
deleteScimUser uid = retry x5 . write del $ params Quorum (Identity uid)
deleteScimUserTimes uid = retry x5 . write del $ params Quorum (Identity uid)
where
del :: PrepQuery W (Identity UserId) ()
del = "DELETE FROM scim_user WHERE id = ?"
del = "DELETE FROM scim_user_times WHERE uid = ?"
21 changes: 0 additions & 21 deletions services/spar/src/Spar/Data/Instances.hs
Original file line number Diff line number Diff line change
Expand Up @@ -32,18 +32,14 @@ module Spar.Data.Instances
where

import Cassandra as Cas
import Data.Aeson (FromJSON, ToJSON)
import qualified Data.Aeson as Aeson
import Data.String.Conversions
import Data.X509 (SignedCertificate)
import Imports
import SAML2.Util (parseURI')
import qualified SAML2.WebSSO as SAML
import Spar.Scim.Types
import Spar.Types
import Text.XML.DSig (parseKeyInfo, renderKeyInfo)
import URI.ByteString
import qualified Web.Scim.Schema.User as Scim

instance Cql SAML.XmlText where
ctype = Tagged TextColumn
Expand Down Expand Up @@ -105,20 +101,3 @@ toVerdictFormat (VerdictFormatConMobile, Just succredir, Just errredir) = Just $
toVerdictFormat _ = Nothing

deriving instance Cql ScimToken

instance
( Scim.UserTypes tag,
uid ~ Scim.UserId tag,
extra ~ Scim.UserExtra tag,
FromJSON extra,
ToJSON extra,
FromJSON uid,
ToJSON uid
) =>
Cql (WrappedScimStoredUser tag)
where
ctype = Tagged BlobColumn
toCql = CqlBlob . Aeson.encode . fromWrappedScimStoredUser

fromCql (CqlBlob t) = WrappedScimStoredUser <$> Aeson.eitherDecode t
fromCql _ = fail "Scim.StoredUser: expected CqlBlob"
Loading