Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a label bias functionality in tag mode #8

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

priancho
Copy link

You can specify a label bias weight in tag mode.
Passed test with a small sample data.

Run crfsuite tag -h for help

@@ -479,7 +481,8 @@ floatval_t crf1dc_viterbi(crf1d_context_t* ctx, int *labels)
cur = ALPHA_SCORE(ctx, 0);
state = STATE_SCORE(ctx, 0);
for (j = 0;j < L;++j) {
cur[j] = state[j];
// HCCHO: cur[j] = state[j];
cur[j] = state[j] + (ctx->bias)[j];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply add the bias weights to the state table?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that your approach is better than my code.
I will fix it and re-commit soon.

@@ -19,13 +21,17 @@ def apply_templates(X, templates):
@type template: tuple of (str, int)
@param template: The feature template.
"""

x_range = list(range(len(X)))
x_range_set = set(x_range)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the rationale behind this. The list cast effectively pre-allocates the list, which seems unnecessary - and the named variable declaration seems like an unnecessary change.

@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<Project DefaultTargets="Build" ToolsVersion="14.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why these were changed?


if (opt->probability) {
floatval_t lognorm;
tagger->lognorm(tagger, &lognorm);
fprintf(fpo, "@score\t%f\t%f\n", score, lognorm);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation does not match surrounding code.

@@ -175,6 +176,18 @@ int Trainer::train(const std::string& model, int holdout)
{
int ret;

if (tr == NULL) {
std::stringstream ss;
ss << "The trainer is not initialized. Call Trainer::select before Trainer::train.";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a constant string. Why is a stream being used?


if (data->attrs == NULL || data->labels == NULL) {
std::stringstream ss;
ss << "The data is empty. Call Trainer::append before Trainer::train.";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto the constant remark above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants