
Commit bb69782

Merge branch 'main' of github.com:morpheuslord/Nmap-API into basecodeupdate
2 parents: 417e0a6 + fe13284

File tree: 2 files changed, +224 -6 lines

README.md

Lines changed: 112 additions & 3 deletions
# Nmap API

Uses Python 3.10, Debian, python-nmap, and the Flask framework to create an Nmap API that can run scans quickly online and is easy to deploy.
The API also includes GPT-3 functionality for AI-generated reports.
This is an implementation for our college PCL project, which is still under development and constantly being updated.


## API Reference
## Improvements

Added GPT functionality with a chunking module.
The methodology is based on how `Langchain GPT embeddings` operate. Basically, the operation goes like this:

```text
Data -> Chunks_generator ─┐           ┌─> AI_Loop -> Data_Extraction -> Return_Data
                          ├─> Chunk1 ─┤
                          ├─> Chunk2 ─┤
                          ├─> Chunk3 ─┤
                          └─> Chunk N ─┘
```
This is how it works:

- **Step 1:**
  - The scan JSON is completed, or the text is extracted and converted into a string.
- **Step 2:**
  - The long string is converted into individual tokens of words and characters, for example `[]{};word` == `'[', ']', '{', '}', ';', 'word'`.
- **Step 3:**
  - The long list of tokens is divided into groups of lists according to how many `tokens` we want.
  - For our use case, we have a prompt plus the extracted data; for simplicity, we went with chunks of `500 tokens` plus the prompt tokens. (A minimal sketch of Steps 2 and 3 follows this list.)
- **Step 4:**
  - Step 4 can be achieved in 3 ways: `a) Langchain`, `b) OpenAI functions feature`, `c) The OpenAI API calls`.
  - From our tests, the first option, `Langchain LLM`, did not work, as it is not built for such processes.
  - The second option, the `OpenAI functions feature`, needed more support and context.
  - The third was the best, as we can provide the rules and output format for it to produce the output we need.
- **Step 5:**
  - The final step is to run the loop, `regex` the output data, and return it as the output.
  - The reason for using regex is that `AI is unpredictable`, so we need to take measures to keep our data usable.
  - The prompt doubles as an output format, making sure the AI gives that output no matter what, so we can easily regex it.
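A minimal sketch of Steps 2 and 3, assuming a simple regex tokenizer (the helper name `chunk_text` and the tokenization rule are illustrative, not the project's actual code):

```python
import re
from typing import Iterator


def chunk_text(data: str, chunk_size: int = 500) -> Iterator[str]:
    # Step 2: split the string into word and punctuation tokens,
    # e.g. "[]{};word" -> '[', ']', '{', '}', ';', 'word'
    tokens = re.findall(r"\w+|[^\w\s]", data)
    # Step 3: group the tokens into chunks of `chunk_size`; the prompt
    # tokens are added on top of each chunk when the AI loop runs
    for i in range(0, len(tokens), chunk_size):
        yield " ".join(tokens[i:i + chunk_size])
```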

AI code:

```python
# Only the tail of AI() appears in this diff; the body is elided.
def AI(analyze: str) -> dict[str, any]:
    ...
    return ai_output
```
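A hedged sketch of what the per-chunk loop inside `AI()` could look like, assuming the legacy `openai` Completion API and the `chunk_text` helper sketched above (the names `ai_loop` and `build_prompt` and the model parameters are assumptions, not the project's actual code):

```python
from typing import Any

import openai


def ai_loop(scan_data: str) -> list[dict[str, Any]]:
    # One completion call per ~500-token chunk
    results = []
    for chunk in chunk_text(scan_data):
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=build_prompt(chunk),  # hypothetical: fills the template below
            max_tokens=1024,
        )
        # Regex-extract the structured fields from the raw completion text
        results.append(extract_ai_output(response.choices[0].text))
    return results
```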

The prompt, regex, and extraction:
```python
prompt = f"""
Do a vulnerability analysis report on the following JSON data provided.
It's the data extracted from my network scanner.
Follow the following rules for analysis:
1) Calculate the criticality score based on the service or CVE.
2) Return all the open ports within the open_ports list.
3) Return all the closed ports within the closed_ports list.
4) Return all the filtered ports within the filtered_ports list.
5) Keep the highest possible accuracy.
6) Do not provide unwanted explanations.
7) Only provide details in the output_format provided.

output_format: {{
    "open_ports": [],
    "closed_ports": [],
    "filtered_ports": [],
    "criticality_score": ""
}}

data = {analyze}
"""
```

The above-mentioned prompt, with a distinct output format, will return this output no matter the instance. The following things need to be addressed:

- The prompt must be detailed.
- The prompt must explain all sorts of use cases and inputs.
- The prompt must be guided by rules to follow.
- The number of tokens must be monitored and taken care of (a token-counting sketch follows this list).
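One way to monitor token counts, assuming the `tiktoken` tokenizer library and a GPT-3 model (both assumptions; the project may count tokens differently):

```python
import tiktoken

# Pick the tokenizer that matches the target model (assumed: text-davinci-003)
encoding = tiktoken.encoding_for_model("text-davinci-003")


def count_tokens(text: str) -> int:
    # Encode the text and count the resulting tokens
    return len(encoding.encode(text))


# Keep the prompt tokens plus the 500-token chunk inside the context window
print(count_tokens(prompt) + 500)
```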
This is the regex for it:

```python
import re
from typing import Any, Callable, cast


def extract_ai_output(ai_output: str) -> dict[str, Any]:
    result = {
        "open_ports": [],
        "closed_ports": [],
        "filtered_ports": [],
        "criticality_score": ""
    }

    # Match and extract ports
    open_ports_match = re.search(r'"open_ports": \[([^\]]*)\]', ai_output)
    closed_ports_match = re.search(r'"closed_ports": \[([^\]]*)\]', ai_output)
    filtered_ports_match = re.search(
        r'"filtered_ports": \[([^\]]*)\]', ai_output)

    # If found, convert string of ports to list
    if open_ports_match:
        result["open_ports"] = list(
            map(cast(Callable[[Any], int], int),
                open_ports_match.group(1).split(',')))
    if closed_ports_match:
        result["closed_ports"] = list(
            map(cast(Callable[[Any], int], int),
                closed_ports_match.group(1).split(',')))
    if filtered_ports_match:
        result["filtered_ports"] = list(
            map(cast(Callable[[Any], int], int),
                filtered_ports_match.group(1).split(',')))

    # Match and extract criticality score
    criticality_score_match = re.search(
        r'"criticality_score": "([^"]*)"', ai_output)
    if criticality_score_match:
        result["criticality_score"] = criticality_score_match.group(1)

    return result
```
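Hypothetical usage, with a made-up completion string only to illustrate what the regex extracts:

```python
sample_output = '''
{
    "open_ports": [22, 80, 443],
    "closed_ports": [21],
    "filtered_ports": [8080],
    "criticality_score": "7.5"
}
'''
print(extract_ai_output(sample_output))
# {'open_ports': [22, 80, 443], 'closed_ports': [21],
#  'filtered_ports': [8080], 'criticality_score': '7.5'}
```

Note that an empty list (e.g. `"closed_ports": []`) would make `int('')` raise a `ValueError`, so empty port lists need guarding in practice.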
The regex makes sure all the data is extracted and returned properly, with the types we want.
This also helps with data management and the removal of unwanted information.

The API key must be set:

```python
openai.api_key = '__API__KEY__'
```
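Hard-coding the key works for testing, but a common alternative (an assumption, not the project's documented approach) is to read it from an environment variable:

```python
import os

import openai

# Fall back to the placeholder if OPENAI_API_KEY is not set
openai.api_key = os.environ.get("OPENAI_API_KEY", "__API__KEY__")
```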

### Package

The package is a simple extension for future usage or upgrades. It can be installed by running:

```bash
cd package && pip install .
```

Usage can be implemented like this:

```python
from nmap_api import app

app.openai.api_key = '__API__KEY__'
app.start_api()
```

#### Default User Keys

**Default_Key**: **cff649285012c6caae4d**

package/README.md

Lines changed: 112 additions & 3 deletions

The diff to package/README.md is identical to the README.md diff above.
