Add the code and README
aspacca committed May 14, 2021
1 parent 533ec0b commit b0cf91f
Showing 4 changed files with 1,743 additions and 0 deletions.
85 changes: 85 additions & 0 deletions README.md
@@ -1,2 +1,87 @@
# slack-matrix-importer
Import a Slack history export to a Matrix server

This README explains the requirements and how to run the script that imports a Slack workspace data export into a Matrix server.
The code is based on https://github.com/matrix-org/matrix-appservice-bridge/blob/develop/HOWTO.md

You need to have:
- A working homeserver install
- A Slack workspace data export (https://slack.com/intl/en-de/help/articles/201658943-Export-your-workspace-data)
- `npm` and `nodejs`
- `mapped_channels.json` and `mapped_users.json` files to map from Slack to Matrix users and channels

NB: This how-to refers to the binary `node` - this may be `nodejs` depending on your distro.

# Install dependencies
Clone the repository and enter the directory.
Run `npm install` to install the required dependencies.
```
$ git clone https://github.com/aspacca/slack-matrix-importer.git
$ cd slack-matrix-importer
$ npm install
```


## Registering as an application service
The script sets up a CLI via the `Cli` class, which dumps the registration file to
`slack-matrix-importer-registration.yaml`. It registers the user ID `@slackbot:domain` and asks
for exclusive rights (so no one else can create them) to the namespace of every user. It also generates two tokens which are used for authentication.

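Now type `DOMAIN=localhost HOMESERVER_URL=http://localhost:9000 IMPORT_FOLDER=/tmp/SLACK-WORKSPACE-DATA node index.js -r -u "http://localhost:9000"` and a file
`slack-matrix-importer-registration.yaml` will be produced. `HOMESERVER_URL` and the `-u` parameter are the same URL, the one that the
homeserver will try to use to communicate with the application service, and `DOMAIN` is the domain of the homeserver.
`index.js` exits unless `DOMAIN`, `HOMESERVER_URL` and `IMPORT_FOLDER` are all defined, so `IMPORT_FOLDER` has to be set for this step too (any value works here, since the folder is only read during the import). In your Synapse install, edit
`homeserver.yaml` to include this file: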
```yaml
app_service_config_files: ["/path/to/slack/matrix/importer/slack-matrix-importer-registration.yaml"]
```
Then restart your homeserver. Your application service is now registered.
## Extracting the Slack workspace data export
You need to extract the Slack workspace data export. You can extract it anywhere, since you will point to that directory later in the process (replace SLACK-WORKSPACE-DATA.zip with the name of your export archive file):
```
$ cd /tmp
$ unzip SLACK-WORKSPACE-DATA.zip -d SLACK-WORKSPACE-DATA
```


## Defining mapping for users and channels
We need to create two JSON files that map the IDs of the users and channels in Slack to the ones on the homeserver.

- `mapped_users.json`

From `/tmp/SLACK-WORKSPACE-DATA/users.json`, find the `id` field of every user
in the Slack workspace and use each one as a key of a JSON object whose value is the ID of the matching Matrix user:
```json
{
"UD34L1FHJ":"@an_user:your-homeserver-domain.com",
"UL09E7XNM":"@another_user:your-homeserver-domain.com"
}
```

- `mapped_channels.json`

From `/tmp/SLACK-WORKSPACE-DATA/channels.json`, find the `name` field of every channel
in the Slack workspace and use each one as a key of a JSON object whose value is the ID of the matching Matrix room:
```json
{
"a-slack-channel":"!lTyPKoNMeWDiOPlfHn:your-homeserver-domain.com",
"another-slack-channel":"!KdbHjErWcXpEQfGMki:your-homeserver-domain.com"
}
```

NB: The `name` field used as a key in the JSON must match the name of the subfolder
inside `/tmp/SLACK-WORKSPACE-DATA` that contains the JSON files with that channel's messages.
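
Writing `mapped_users.json` by hand gets tedious for large workspaces. The following is a minimal sketch of a helper that generates a starting point from `users.json`. It assumes that each entry has a `name` field next to `id` and that the Matrix localpart equals the Slack `name`; both are assumptions, and the helper names `buildUserMapping` and `writeUserMapping` are hypothetical, so review the generated file before importing.

```javascript
// Sketch: build a starting point for mapped_users.json from a Slack export.
// Assumption: every entry of users.json has `id` and `name`, and the Matrix
// localpart is the same as the Slack `name`. Fix the result by hand if not.
const fs = require('fs');
const path = require('path');

function buildUserMapping(slackUsers, domain) {
    const mapping = {};
    for (const user of slackUsers) {
        // skip entries that lack one of the two fields
        if (user.id && user.name) {
            mapping[user.id] = '@' + user.name + ':' + domain;
        }
    }
    return mapping;
}

function writeUserMapping(exportFolder, domain, outFile) {
    const usersFile = path.join(exportFolder, 'users.json');
    const slackUsers = JSON.parse(fs.readFileSync(usersFile, 'utf8'));
    const mapping = buildUserMapping(slackUsers, domain);
    fs.writeFileSync(outFile, JSON.stringify(mapping, null, 2) + '\n');
    return mapping;
}
```

For example, `writeUserMapping('/tmp/SLACK-WORKSPACE-DATA', 'your-homeserver-domain.com', 'mapped_users.json')` would write the skeleton into the current directory.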

# Run the import
Run the app service with `DOMAIN=localhost HOMESERVER_URL=http://localhost:9000 IMPORT_FOLDER=/tmp/SLACK-WORKSPACE-DATA node index.js -p 9000` and wait until the last message is printed to the console and imported into Matrix.
Once it's done you can exit with CTRL+C.


# Notes
Messages that don't have an entry (either for the user or the channel) in the mapping json files won't be imported.
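
Because unmapped messages are skipped silently, it can help to check the mappings before running the import. The sketch below walks the export folder the same way `index.js` does and reports channel folders and user IDs that have no mapping entry. The function name `findUnmapped` is hypothetical and not part of the repository.

```javascript
// Sketch: list Slack channels and users that have no entry in the mapping files.
const fs = require('fs');
const path = require('path');

function findUnmapped(exportFolder, users, channels) {
    const unmappedChannels = [];
    const unmappedUsers = new Set();
    for (const entry of fs.readdirSync(exportFolder).sort()) {
        const folder = path.join(exportFolder, entry);
        // top-level files such as users.json are not channel folders
        if (!fs.statSync(folder).isDirectory()) continue;
        if (!Object.prototype.hasOwnProperty.call(channels, entry)) {
            unmappedChannels.push(entry);
        }
        for (const file of fs.readdirSync(folder)) {
            if (path.extname(file) !== '.json') continue;
            const messages = JSON.parse(fs.readFileSync(path.join(folder, file), 'utf8'));
            for (const msg of messages) {
                if (msg.user && !Object.prototype.hasOwnProperty.call(users, msg.user)) {
                    unmappedUsers.add(msg.user);
                }
            }
        }
    }
    return { unmappedChannels, unmappedUsers: [...unmappedUsers].sort() };
}
```

A possible call is `findUnmapped(process.env.IMPORT_FOLDER, require('./mapped_users.json'), require('./mapped_channels.json'))`; two empty arrays in the result mean every channel folder and every message author is covered.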

The messages are sent to the homeserver with the `Intent` object obtained from the bridge (https://github.com/matrix-org/matrix-appservice-bridge/blob/develop/README.md#intent).
This makes sure that the user you are importing the message from has joined the room before the message is sent.
If the user cannot join the room the message is sent to, an exception is thrown and the message won't be imported. The exception is not caught per message in `index.js`, so the import stops at that point: it's suggested to have all the users on the homeserver already joined to the rooms you are importing into.
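
One way to surface join problems before the import starts is to attempt the joins up front and collect the failures. This is only a sketch: it assumes `bridge` is the running `Bridge` instance from `index.js` and that the `Intent` object exposes a `join(roomID)` method. The helper `preJoinUsers` and the shape of `pairs` are hypothetical.

```javascript
// Sketch: try to join each user to each room they will post in, before importing.
// Assumption: bridge.getIntent(userID) returns an Intent with an async join(roomID).
async function preJoinUsers(bridge, pairs) {
    const failures = [];
    for (const { userID, roomID } of pairs) {
        try {
            await bridge.getIntent(userID).join(roomID);
        } catch (error) {
            // collect the failure instead of aborting, so all problems are reported at once
            failures.push({ userID, roomID, reason: String((error && error.message) || error) });
        }
    }
    return failures;
}
```

The `pairs` array would hold `{ userID, roomID }` objects built from the two mapping files. If the returned array is not empty, those users need to be invited or joined manually before running the import.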


178 changes: 178 additions & 0 deletions index.js
@@ -0,0 +1,178 @@
// Usage:
// node index.js -r -u "http://localhost:9000" # remember to add the registration!
// node index.js -p 9000
const fs = require('fs');
const Cli = require("matrix-appservice-bridge").Cli;
const Bridge = require("matrix-appservice-bridge").Bridge;
const AppServiceRegistration = require("matrix-appservice-bridge").AppServiceRegistration;

// Sequential forEach: awaits each callback before moving to the next item.
async function asyncForEach(array, callback) {
    for (let index = 0; index < array.length; index++) {
        await callback(array[index], index, array);
    }
}

const CHANNELS = require('./mapped_channels.json');
const USERS = require('./mapped_users.json');

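// Replace Slack mentions such as <@UD34L1FHJ> with the mapped Matrix user ID.
function replaceUserCitation(str) {
    return str.replace(/<@([^>]+)>/g, function (citation, userID) {
        // own-property check so keys such as "constructor" never match
        if (Object.prototype.hasOwnProperty.call(USERS, userID)) {
            return USERS[userID];
        } else {
            // unmapped user: keep the Slack ID, prefixed with '@'
            return '@' + userID;
        }
    });
}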


var htmlEntities = {
    nbsp: ' ',
    cent: '¢',
    pound: '£',
    yen: '¥',
    euro: '€',
    copy: '©',
    reg: '®',
    lt: '<',
    gt: '>',
    quot: '"',
    amp: '&',
    apos: '\''
};

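// Decode named (&amp;), hexadecimal (&#x26;) and decimal (&#38;) HTML entities.
function unescapeHTML(str) {
    return str.replace(/&([^;]+);/g, function (entity, entityCode) {
        var match;
        var codePoint;

        // own-property check so names such as "constructor" are not treated as entities
        if (Object.prototype.hasOwnProperty.call(htmlEntities, entityCode)) {
            return htmlEntities[entityCode];
        /*eslint no-cond-assign: 0*/
        } else if (match = entityCode.match(/^#x([\da-fA-F]+)$/)) {
            codePoint = parseInt(match[1], 16);
        } else if (match = entityCode.match(/^#(\d+)$/)) {
            codePoint = parseInt(match[1], 10);
        } else {
            return entity;
        }

        // fromCodePoint handles code points above U+FFFF; out-of-range values are left as-is
        return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : entity;
    });
}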

let bridge;

const DOMAIN = process.env.DOMAIN;
const IMPORT_FOLDER = process.env.IMPORT_FOLDER;
const HOMESERVER_URL = process.env.HOMESERVER_URL;

if (undefined === DOMAIN || undefined === IMPORT_FOLDER || undefined === HOMESERVER_URL) {
    console.log("Please define all ENV variables (DOMAIN, IMPORT_FOLDER, HOMESERVER_URL)");
    process.exit(255);
}

new Cli({
    registrationPath: "slack-matrix-importer-registration.yaml",
    generateRegistration: function(reg, callback) {
        reg.setId(AppServiceRegistration.generateToken());
        reg.setHomeserverToken(AppServiceRegistration.generateToken());
        reg.setAppServiceToken(AppServiceRegistration.generateToken());
        reg.setSenderLocalpart("slackbot");
        callback(reg);
    },
    run: function(port, config) {
        bridge = new Bridge({
            homeserverUrl: HOMESERVER_URL,
            domain: DOMAIN,
            registration: "slack-matrix-importer-registration.yaml",

            controller: {
                onUserQuery: function(queriedUser) {
                    return {}; // auto-provision users with no additional data
                },

                onEvent: function(request, context) {
                    // incoming Matrix events are ignored: this is a one-way import
                }
            }
        });
        console.log("Matrix-side listening on port %s", port);
        bridge.run(port, config).then(async () => {
            // every subfolder of IMPORT_FOLDER is treated as one Slack channel
            let importFolders = fs.readdirSync(IMPORT_FOLDER).sort();
            await asyncForEach(importFolders, async (channel) => {
                let currentFolder = IMPORT_FOLDER + '/' + channel;
                let stat = fs.statSync(currentFolder);
                if (!stat.isDirectory()) {
                    return;
                }

                let importJSON = fs.readdirSync(currentFolder).sort();
                await asyncForEach(importJSON, async (currentJson) => {
                    let jsonFile = currentFolder + '/' + currentJson;
                    if (currentJson.split('.').pop() !== 'json') {
                        return;
                    }

                    // sort the messages of the file by their Slack timestamp
                    let rawdata = fs.readFileSync(jsonFile);
                    let messages = JSON.parse(rawdata).sort(function(a,b) {
                        if (a.hasOwnProperty("ts") && b.hasOwnProperty("ts")) {
                            if (a["ts"] > b["ts"]) {
                                return 1;
                            } else if (a["ts"] < b["ts"]) {
                                return -1;
                            }
                        }

                        return 0;
                    });

                    await asyncForEach(messages, async (msg) => {
                        // skip anything that is not a plain user message
                        if (!msg.hasOwnProperty('user')) {
                            return;
                        }

                        if (!msg.hasOwnProperty('text')) {
                            return;
                        }

                        if (!msg.hasOwnProperty('type') || msg.type !== "message") {
                            return;
                        }

                        // the only subtype that is imported is "thread_broadcast"
                        if (msg.hasOwnProperty('subtype') && msg.subtype !== "thread_broadcast") {
                            return;
                        }

                        // skip messages whose user or channel has no mapping
                        if (!USERS.hasOwnProperty(msg.user)) {
                            return;
                        }

                        if (!CHANNELS.hasOwnProperty(channel)) {
                            return;
                        }

                        let text;
                        if (msg.hasOwnProperty('subtype') && msg.subtype === "thread_broadcast") {
                            // quote the thread root above the broadcast reply
                            if (msg.hasOwnProperty('root') && msg.root.hasOwnProperty("text")) {
                                text = '> ' + msg.root.text + "\n\n" + msg.text;
                            } else {
                                text = msg.text;
                            }
                        } else {
                            text = msg.text;
                        }

                        text = unescapeHTML(text);
                        text = replaceUserCitation(text);

                        let userID = USERS[msg.user];
                        let roomID = CHANNELS[channel];

                        let intent = bridge.getIntent(userID);
                        console.log(msg.ts, roomID, userID, jsonFile, text);
                        await intent.sendText(roomID, text);
                    });
                });
            });
        }).catch(function(error) {
            console.log(error)
        });
    }
}).run();