-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add role splitting in markdown processor, remove old guidance regions #34
Conversation
…letionService.requestCompletion in favor of guidance message role
Update comment handling in MDP to only strip `<!---`
/review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey extremeheat, thanks for the pull request! This looks like a great update. I have a few comments and suggestions:
-
In
docs/MarkdownProcessing.md
, I think it would be helpful to explicitly state that theguidance
role should be the last message in the array. This is mentioned in the note at the end of the section, but it might be easy to miss.Here's the code I'm referring to:
Note: there can only be one guidance message and it must be the last one. You should remove it from the messages array the next call you do to requestChatCompletion. This feature works best when used with the role parsing system above.
I suggest adding a sentence like this to the main paragraph:
"The
guidance
role should always be the last message in themessages
array." -
In
src/Flow.js
, the_hash
function now usesJSON.stringify
to hash the arguments. While this works, it might be more efficient to use a dedicated hashing library likecrypto
to create a hash of the arguments. This would avoid the overhead of stringifying the arguments, especially if they are large objects.Here's the code I'm referring to:
_hash (...args) { const hash = crypto.createHash('sha1') hash.update(JSON.stringify(args)) return hash.digest('hex') }
-
In
src/tools/mdp.js
, thesegmentByRoles
function uses a loop to iterate over the text and find the role tokens. This could potentially be optimized by using a regular expression to find all the role tokens at once.Here's the code I'm referring to:
function segmentByRoles (text, roles) { // split the text into segments based on the roles const segments = [] for (let i = 0; i < text.length; i++) { for (const role in roles) { if (text.slice(i, i + role.length) === role) { segments.push({ role, start: i, end: i + role.length }) i += role.length break } } }
-
In
src/tools/stripping.js
, thetokenizeMarkdown
function has a complex logic to handle code blocks and preformatted blocks. It might be worth considering using a dedicated markdown parsing library to handle this logic, which could potentially simplify the code and make it more robust.Here's the code I'm referring to:
function tokenizeMarkdown (comment, options) { const tokens = [] let tokenSoFar = '' let inCodeBlock = false let inCodeLang let inPreTag = false let linePadding = 0
Overall, I think this pull request is well-written and introduces some valuable features. I would be happy to accept it once the above comments are addressed.
@@ -1,4 +1,4 @@ | |||
LXL contains a simple templating system that allows you to conditionally insert data into the markdown at runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In docs/MarkdownProcessing.md
, I think it would be helpful to explicitly state that the guidance
role should be the last message in the array. This is mentioned in the note at the end of the section, but it might be easy to miss.
Here's the code I'm referring to:
Note: there can only be one guidance message and it must be the last one. You should remove
it from the messages array the next call you do to requestChatCompletion. This feature works
best when used with the role parsing system above.
I suggest adding a sentence like this to the main paragraph:
"The
guidance
role should always be the last message in themessages
array."
_hash (...args) { | ||
const hash = crypto.createHash('sha1') | ||
args.filter(e => e != null).map(String).forEach(arg => hash.update(arg)) | ||
hash.update(JSON.stringify(args)) | ||
return hash.digest('hex') | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In src/Flow.js
, the _hash
function now uses JSON.stringify
to hash the arguments. While this works, it might be more efficient to use a dedicated hashing library like crypto
to create a hash of the arguments. This would avoid the overhead of stringifying the arguments, especially if they are large objects.
function segmentByRoles (text, roles) { | ||
// split the text into segments based on the roles | ||
const segments = [] | ||
for (let i = 0; i < text.length; i++) { | ||
for (const role in roles) { | ||
if (text.slice(i, i + role.length) === role) { | ||
segments.push({ role, start: i, end: i + role.length }) | ||
i += role.length | ||
break | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In src/tools/mdp.js
, the segmentByRoles
function uses a loop to iterate over the text and find the role tokens. This could potentially be optimized by using a regular expression to find all the role tokens at once.
function tokenizeMarkdown (comment, options) { | ||
// console.log('Tokenize', comment) | ||
const tokens = [] | ||
let tokenSoFar = '' | ||
let inCodeBlock = false | ||
let inCodeLang | ||
let inPreTag = false | ||
let linePadding = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In src/tools/stripping.js
, the tokenizeMarkdown
function has a complex logic to handle code blocks and preformatted blocks. It might be worth considering using a dedicated markdown parsing library to handle this logic, which could potentially simplify the code and make it more robust.
540d3c2
to
fabe06a
Compare
Breaking
prompt: string
replaced withprompt: { system?: string, user?: string }
orprompt: { text: string, roles: Record<string, Role> }
to support new role handling.Features
requestCompletion()
in favor of a new "guidance" role in messages which can be used now via "requestChatCompletion"content
andtext
are returned on all completion requests, whether chat or notFixes