{smcl}
{* *! version 2.49.1 08aug2023}{...}
{vieweralsosee "fegen" "help fegen"}{...}
{vieweralsosee "fcollapse" "help fcollapse"}{...}
{vieweralsosee "join" "help join"}{...}
{vieweralsosee "fmerge" "help fmerge"}{...}
{vieweralsosee "flevelsof" "help flevelsof"}{...}
{vieweralsosee "fisid" "help fisid"}{...}
{vieweralsosee "fsort" "help fsort"}{...}
{vieweralsosee "" "--"}{...}
{vieweralsosee "[R] egen" "help egen"}{...}
{vieweralsosee "[R] collapse" "help collapse"}{...}
{vieweralsosee "[R] contract" "help contract"}{...}
{vieweralsosee "[R] merge" "help merge"}{...}
{vieweralsosee "[R] levelsof" "help levelsof"}{...}
{vieweralsosee "[R] sort" "help sort"}{...}
{vieweralsosee "" "--"}{...}
{vieweralsosee "moremata" "help moremata"}{...}
{vieweralsosee "reghdfe" "help reghdfe"}{...}
{viewerjumpto "Syntax" "ftools##syntax"}{...}
{viewerjumpto "Creation" "ftools##creation"}{...}
{viewerjumpto "Properties and methods" "ftools##properties"}{...}
{viewerjumpto "Description" "ftools##description"}{...}
{viewerjumpto "Usage" "ftools##usage"}{...}
{viewerjumpto "Example" "ftools##example"}{...}
{viewerjumpto "Remarks" "ftools##remarks"}{...}
{viewerjumpto "Using functions from collapse" "ftools##collapse"}{...}
{viewerjumpto "Experimental/advanced" "ftools##experimental"}{...}
{viewerjumpto "Source code" "ftools##source"}{...}
{viewerjumpto "Author" "ftools##author"}{...}
{title:Title}
{p2colset 5 15 20 2}{...}
{p2col :{cmd:FTOOLS} {hline 2}}Mata commands for factor variables{p_end}
{p2colreset}{...}
{marker syntax}{...}
{title:Syntax}
{p 8 16 2}
{it:class Factor scalar}
{bind: }{cmd:factor(}{space 3}{it:varnames} [{space 1}
{cmd:,}
{it:touse}{cmd:,}
{it:verbose}{cmd:,}
{it:method}{cmd:,}
{it:sort_levels}{cmd:,}
{it:count_levels}{cmd:,}
{it:hash_ratio}{cmd:,}
{it:save_keys}]{cmd:)}
{p 8 16 2}
{it:class Factor scalar}
{bind: }{cmd:_factor(}{it:data} [{cmd:,}
{it:integers_only}{cmd:,}
{it:verbose}{cmd:,}
{it:method}{cmd:,}
{it:sort_levels}{cmd:,}
{it:count_levels}{cmd:,}
{it:hash_ratio}{cmd:,}
{it:save_keys}]{cmd:)}
{p 8 16 2}
{it:class Factor scalar}
{bind: }{cmd:join_factors(}{it:F1}{cmd:,}
{it:F2} [{cmd:,}
{it:count_levels}{cmd:,}
{it:save_keys}{cmd:,}
{it:levels_as_keys}]{cmd:)}
{marker arguments}{...}
{synoptset 38 tabbed}{...}
{synopthdr}
{synoptline}
{p2coldent:* {it:string} varnames}names of variables that identify the factors{p_end}
{synopt:{it:string} touse}name of dummy {help mark:touse} variable{p_end}
{p2coldent:}{bf:note:} you can also pass a vector with the obs. index (i.e. the first argument of {cmd:st_data()}){p_end}
{synopt:{it:string} data}transmorphic matrix with the group identifiers{p_end}
{synopt:{bf:Advanced options:}}{p_end}
{synopt:{it:real} verbose}1 to display debug information{p_end}
{synopt:{it:string} method}hashing method: mata, hash0, hash1, hash2; default is {it:mata} (auto-choose){p_end}
{synopt:{it:real} sort_levels}set to 0 under {it:hash1} to increase speed, but the new levels will not match the order of the varlist{p_end}
{synopt:{it:real} count_levels}set to 0 under {it:hash0} to increase speed, but the {it:F.counts} vector will not be generated
so F{cmd:.panelsetup()}, F{cmd:.drop_obs()}, and related methods will not be available{p_end}
{synopt:{it:real} hash_ratio}size of the hash vector compared to the maximum number of keys (often num. obs.){p_end}
{synopt:{it:real} save_keys}set to 0 to increase speed and save memory,
but the matrix {it:F.keys} with the original values of the factors
won't be created{p_end}
{synopt:{it:real} integers_only}whether {it:data} is numeric and takes only {it:integer} values (unless you are sure that it does, set it to 0){p_end}
{synopt:{it:real} levels_as_keys}if set to 1,
{cmd:join_factors()} will use the levels of F1 and F2
as the keys (as the data) when creating F12{p_end}
{p2colreset}{...}
{marker creation}{...}
{title:Creating factor objects}
{pstd}(optional) First, you can declare the Factor object:
{p 8 8 2}
{cmd:class Factor scalar}{it: F}{break}
{pstd}Then, you can create a factor from one or more categorical variables:
{p 8 8 2}
{it:F }{cmd:=}{bind: }{cmd:factor(}{it:varnames}{cmd:)}
{pstd}
If the categories are already in Mata
({cmd:data = st_data(., varnames)}), you can do:
{p 8 8 2}
{it:F }{cmd:=}{bind: }{cmd:_factor(}{it:data}{cmd:)}
{pstd}
You can also combine two factors ({it:F1} and {it:F2}):
{p 8 8 2}
{it:F }{cmd:=}{bind: }{cmd:join_factors(}{it:F1}{cmd:,} {it:F2}{cmd:)}
{pstd}
Note that the above is equivalent to (but faster than):
{p 8 8 2}
{it: varnames} {cmd:= invtokens((}{it:F1.varlist}{cmd:,} {it:F2.varlist}{cmd:))}{break}
{it:F} {cmd:=} {cmd:factor(}{it:varnames}{cmd:)}
{pstd}
If {it:levels_as_keys==1}, it is equivalent to:
{p 8 8 2}
{it:F }{cmd:=}{bind: }{cmd:_factor((}{it:F1.levels}{cmd:,} {it:F2.levels}{cmd:))}
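{pstd}
As a minimal sketch of the equivalence described above (using the {it:auto} dataset purely for illustration), the factor built by {cmd:join_factors()} should have the same number of levels and observations as the one built directly from both variables:
{inp}
{hline 60}
sysuse auto, clear
mata:
// Build one factor per variable, then combine them
F1 = factor("foreign")
F2 = factor("turn")
F12 = join_factors(F1, F2)
// Build the same factor directly from both variables
F = factor("foreign turn")
// Both approaches should find the same distinct (foreign, turn) pairs
assert(F12.num_levels == F.num_levels)
assert(F12.num_obs == F.num_obs)
end
{hline 60}
{txt}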
{marker properties}{...}
{title:Properties and Methods}
{marker arguments}{...}
{synoptset 38 tabbed}{...}
{synopthdr:properties}
{synoptline}
{synopt:{it:real} F{cmd:.num_levels}}number of levels (distinct values) of the factor{p_end}
{synopt:{it:real} F{cmd:.num_obs}}number of observations of the sample used to create the factor ({cmd:c(N)} if touse was empty){p_end}
{synopt:{it:real colvector} F{cmd:.levels}}levels of the factor; dimension {cmd:F.num_obs x 1}; range: {cmd:{1, ..., F.num_levels}}{p_end}
{synopt:{it:transmorphic matrix} F{cmd:.keys}}values of the input varlist that correspond to the factor levels;
dimension {cmd:F.num_levels x 1}; not created if save_keys==0; unordered if sort_levels==0{p_end}
{synopt:{it:real vector} F{cmd:.counts}}frequencies of each level (in the sample set by touse);
dimension {cmd:F.num_levels x 1}; will be empty if count_levels==0{p_end}
{synopt:{it:string rowvector} F{cmd:.varlist}}name of variables used to create the factor{p_end}
{synopt:{it:string rowvector} F{cmd:.varformats}}formats of the input variables{p_end}
{synopt:{it:string rowvector} F{cmd:.varlabels}}labels of the input variables{p_end}
{synopt:{it:string rowvector} F{cmd:.varvaluelabels}}value labels attached to the input variables{p_end}
{synopt:{it:string rowvector} F{cmd:.vartypes}}types of the input variables{p_end}
{synopt:{it:string rowvector} F{cmd:.vl}}value label definitions used by the input variables{p_end}
{synopt:{it:string} F{cmd:.touse}}name of touse variable{p_end}
{synopt:{it:string} F{cmd:.is_sorted}}1 if the dataset is sorted by F{cmd:.varlist}{p_end}
{synopthdr:main methods}
{synoptline}
{synopt:{it:void} F{cmd:.store_levels(}{newvar}{cmd:)}}save
the levels back into the dataset (using the same {it:touse}){p_end}
{synopt:{it:void} F{cmd:.store_keys(}[{it:sort}]{cmd:)}}save
the original key variables into a reduced dataset, including formatting and labels. If {it:sort} is 1, Stata will report the dataset as sorted{p_end}
{synopt:{it:void} F{cmd:.panelsetup()}}compute auxiliary vectors {it:F.info}
and {it:F.p} (see below); used in panel computations{p_end}
{synopthdr:ancillary methods}
{synoptline}
{synopt:{it:real scalar} F{cmd:.equals(}F2{cmd:)}}1
if {it:F} represents the same data as {it:F2}
(i.e. if .num_obs, .num_levels, .levels, .keys, and .counts are equal)
{p_end}
{synopt:{it:real scalar} F{opt .nested_within(vec)}}1
if the factor {it:F} is
{browse "http://scorreia.com/software/reghdfe/faq.html#what-does-fixed-effect-nested-within-cluster-means":nested within}
the column vector {it:vec}
(i.e. if any two obs. with the same factor level also have the same value of {it:vec}).
For instance, it is true if the factor {it:F} represents counties and {it:vec} represents states.
{p_end}
{synopt:{it:void} F{cmd:.drop_obs(}{it:idx}{cmd:)}}update
{it:F} to reflect a change in the underlying dataset, where
the observations listed in the column vector {it:idx} are dropped
(see example below)
{p_end}
{synopt:{it:void} F{cmd:.keep_obs(}{it:idx}{cmd:)}}equivalent
to keeping only the obs. enumerated by {it:idx} and recreating {it:F};
uses {cmd:.drop_obs()}
{p_end}
{synopt:{it:void} F{cmd:.drop_if(}{it:vec}{cmd:)}}equivalent
to dropping the obs. where {it:vec==0} and recreating {it:F};
uses {cmd:.drop_obs()}
{p_end}
{synopt:{it:void} F{cmd:.keep_if(}{it:vec}{cmd:)}}equivalent
to keeping the obs. where {it:vec!=0} and recreating {it:F};
uses {cmd:.drop_obs()}
{p_end}
{synopt:{it:real colvector} F{cmd:.drop_singletons()}}equivalent
to dropping the levels that only appear once,
and their corresponding observations.
The colvector returned contains the observations that need to be excluded
(note: see the source code for some advanced optional arguments).
{p_end}
{synopt:{it:real scalar} F{opt .is_id()}}1
if {it:F.counts} is always 1
(i.e. if {it:F.levels} has no duplicates)
{p_end}
{synopt:{it:real vector} F{cmd:.intersect(}{it:vec}{cmd:)}}return
a mask vector equal to 1 if the row of {it:vec} is also in F.keys.
Also accepts the integers_only and verbose options: {it:mask = F.intersect(y, 1, 1)}
{p_end}
{synopthdr:available after F.panelsetup()}
{synoptline}
{synopt:{it:transmorphic matrix} F{cmd:.sort(}{it:data}{cmd:)}}equivalent to
{cmd:data[F.p, .]}
but calls {cmd:F.panelsetup()} if required; {it:data} is a {it:transmorphic matrix}{p_end}
{synopt:{it:transmorphic matrix} F{cmd:.invsort(}{it:data}{cmd:)}}equivalent to
{cmd:data[invorder(F.p), .]}, so it undoes a previous sort operation. Note that {cmd:F.invsort(F.sort(x))==x}. Also, after its first use it fills the vector {cmd:F.inv_p = invorder(F.p)} so the operation can be repeated easily.
{p_end}
{synopt:{it:void} F{cmd:._sort(}{it:data}{cmd:)}}in-place version of
{cmd:.sort()};
slower but uses less memory, as it's based on {cmd:_collate()}{p_end}
{synopt:{it:real matrix} F{cmd:.info}}equivalent to {help mf_panelsetup:panelsetup()}
(returns a {it:(num_levels X 2)} matrix with start and end positions of each level/panel).{p_end}
{p2coldent:}{bf:note:} instead of using {cmd:F.info} directly, use {cmd:panelsubmatrix()}:
{cmd:x = panelsubmatrix(X, i, F.info)} and {cmd:panelsum()} (see example at the end){p_end}
{synopt:{it:real vector} F{cmd:.p}}equivalent to {cmd:order(F.levels)}
but implemented with a counting sort that is asymptotically
faster ({it:O(N)} instead of {it:O(N log N)}).{p_end}
{p2coldent:}{bf:note:} do not use {cmd:F.p} directly, as it will be missing if the data is already sorted by the varnames.{p_end}
{p2colreset}{...}
{pstd}Notes:
{synoptset 3 tabbed}{...}
{synopt:- }If you just downloaded the package and want to use the Mata functions directly (instead of the Stata commands), run {stata ftools} once, which creates the Mata library if needed.{p_end}
{synopt:- }To force compilation of the Mata library, type {stata ftools, compile}{p_end}
{synopt:- }{cmd:F.extra} is an undocumented {help mf_asarray:asarray}
that can be used to store additional information: {cmd:asarray(F.extra, "lorem", "ipsum")};
and retrieve it: {cmd:ipsum = asarray(F.extra, "lorem")}{p_end}
{synopt:- }{cmd:join_factors()} is particularly fast if the dataset is sorted in the same order as the factors{p_end}
{synopt:- }{cmd:factor()} will call {cmd:join_factors()} if appropriate
(2+ integer variables; 10,000+ obs; and method=hash1)
{p_end}
{marker description}{...}
{title:Description}
{pstd}
The {it:Factor} object is a key component of several commands that
manipulate data without having to sort it beforehand:
{pmore}- {help fcollapse} (alternative to collapse, contract, collapse+merge and some egen functions){p_end}
{pmore}- {help fegen:fegen group}{p_end}
{pmore}- {help fisid}{p_end}
{pmore}- {help join} and {help fmerge} (alternative to m:1 and 1:1 merges){p_end}
{pmore}- {help flevelsof} (plug-in alternative to {help levelsof}){p_end}
{pmore}- {help fsort} (note: this is O(N) but with a high constant term){p_end}
{pmore}- freshape{p_end}
{pstd}
Ancillary commands include:
{pmore}- {help local_inlist}: returns a local {it:inlist} based on a variable and a list of values or labels{p_end}
{pstd}
The {it:Factor} object rearranges one or more categorical variables into a new variable that takes values from 1 to F.num_levels. You can then efficiently sort any other variable by this new variable, in order to compute group statistics and perform other manipulations.
{pstd}
For technical information, see
{browse "http://stackoverflow.com/questions/8991709/why-are-pandas-merges-in-python-faster-than-data-table-merges-in-r/8992714#8992714":[1]},
{browse "http://wesmckinney.com/blog/nycpython-1102012-a-look-inside-pandas-design-and-development/":[2]},
and to a lesser degree
{browse "https://my.vertica.com/docs/7.1.x/HTML/Content/Authoring/AnalyzingData/Optimizations/AvoidingGROUPBYHASHWithProjectionDesign.htm":[3]}.
{marker usage}{...}
{title:Usage}
{pstd}
If you only want to create identifiers based on one or more variables,
run something like:
{inp}
{hline 60}
sysuse auto, clear
mata: F = factor("foreign turn")
mata: F.store_levels("id")
mata: mata drop F
{hline 60}
{txt}
{pstd}
More complex scenarios would involve some of the following:
{inp}
{hline 60}
sysuse auto, clear
* Create factors for foreign data only
mata: F = factor("turn", "foreign")
* Report number of levels, obs. in sample, and keys
mata: F.num_levels
mata: F.num_obs
mata: F.keys, F.counts
* View new levels
mata: F.levels[1::10]
* Store back new levels (on the same sample)
mata: F.store_levels("id")
* Verify that the results are correct
sort id
li turn foreign id in 1/10
{hline 60}
{txt}
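{pstd}
The ancillary methods can be called in the same way. As a minimal sketch (again using the {it:auto} dataset, where {it:make} uniquely identifies each observation and {it:foreign} does not), {cmd:F.is_id()} reports whether a varlist is an identifier:
{inp}
{hline 60}
sysuse auto, clear
* make has one observation per level, so this is expected to show 1
mata: F = factor("make")
mata: F.is_id()
* foreign has many observations per level, so this is expected to show 0
mata: F = factor("foreign")
mata: F.is_id()
mata: mata drop F
{hline 60}
{txt}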
{marker example}{...}
{title:Example: operating on levels of each factor}
{pstd}
This example shows how to process data for each level of the factor (like {help bysort}). It does so by combining {cmd:F.sort()} with {help mf_panelsetup:panelsubmatrix()}.
{p_end}
{pstd}
In particular, this code runs a regression for each category of {it:turn}:
{p_end}
{inp}
{hline 60}
clear all
mata:
real matrix reg_by_group(string depvar, string indepvars, string byvar)
{
    class Factor scalar F
    real scalar i
    real matrix X, Y, x, y, betas
    F = factor(byvar)
    // Sort the data by the factor levels; F.sort() calls F.panelsetup() if required
    Y = F.sort(st_data(., depvar))
    X = F.sort(st_data(., tokens(indepvars)))
    // One row of coefficients per level: slopes first, then the constant
    betas = J(F.num_levels, 1 + cols(X), .)
    for (i = 1; i <= F.num_levels; i++) {
        y = panelsubmatrix(Y, i, F.info)
        x = panelsubmatrix(X, i, F.info) , J(rows(y), 1, 1)
        betas[i, .] = qrsolve(x, y)'
    }
    return(betas)
}
end
sysuse auto
mata: reg_by_group("price", "weight length", "foreign")
{hline 60}
{text}
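{pstd}
When only group totals are needed, the same sort-then-slice pattern can be written without a loop. The following is a minimal sketch, assuming the built-in Mata function {cmd:panelsum()} is available in your Stata version; it computes the mean price for each level of {it:foreign}:
{inp}
{hline 60}
sysuse auto, clear
mata:
F = factor("foreign")
F.panelsetup()
// Sort prices by level so each level occupies a contiguous block of rows
price = F.sort(st_data(., "price"))
// Sum each block, using the start/end positions stored in F.info
totals = panelsum(price, F.info)
// Divide by the frequency of each level to obtain group means
means = totals :/ F.counts
F.keys, F.counts, means
end
{hline 60}
{txt}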
{marker example2}{...}
{title:Example: Factors nested within another variable}
{pstd}
You might be interested in knowing whether a categorical variable is nested within another, coarser variable.
For instance, a variable containing months ("Jan2017") is nested within another containing years ("2017"),
a variable containing counties ("Durham County, NC") is nested within another containing states ("North Carolina"), and so on.
{p_end}
{pstd}
To check for this, you can follow this example:
{p_end}
{inp}
{hline 60}
sysuse auto, clear
gen turn10 = int(turn/10)
mata:
F = factor("turn")
F.nested_within(st_data(., "trunk")) // False
F.nested_within(st_data(., "turn")) // Trivially true
F.nested_within(st_data(., "turn10")) // True
end
{hline 60}
{txt}
{pstd}
You can also compare two factors directly:
{p_end}
{inp}
{hline 60}
mata:
F1 = factor("turn")
F2 = factor("turn10")
F1.nested_within(F2.levels) // True
end
{hline 60}
{txt}
{marker example3}{...}
{title:Example: Updating a factor after dropping observations}
{pstd}
If you change the underlying dataset you have to recreate the factor, which is costly. As an alternative, you can use {cmd:.keep_obs()} and related methods:
{p_end}
{inp}
{hline 60}
* Benchmark
sysuse auto, clear
drop if price > 4500
mata: F1 = factor("turn")
* Quickly inspect results
mata: F1.num_obs, F1.num_levels, hash1(F1.levels)
* Using F.drop_obs()
sysuse auto, clear
mata
price = st_data(., "price")
F2 = factor("turn")
idx = selectindex(price :> 4500)
F2.num_obs, F2.num_levels, hash1(F2.levels)
F2.drop_obs(idx)
F2.num_obs, F2.num_levels, hash1(F2.levels)
assert(F1.equals(F2))
end
* Using the other methods
mata
F2 = factor("turn")
idx = selectindex(price :<= 4500)
F2.keep_obs(idx)
assert(F1.equals(F2))
F2 = factor("turn")
F2.drop_if(price :> 4500)
assert(F1.equals(F2))
F2 = factor("turn")
F2.keep_if(price :<= 4500)
assert(F1.equals(F2))
end
{hline 60}
{txt}
{marker remarks}{...}
{title:Remarks}
{pstd}
All-numeric and all-string varlists are allowed, but
hybrid varlists (where some but not all variables are strings) are not possible
due to Mata limitations.
As a workaround, first convert the string variables to numeric (e.g. using {cmd:fegen group()}) and then run your intended command.
{pstd}
You can pass as {varlist} a string like "turn trunk"
or a tokenized string like ("turn", "trunk").
{pstd}
To generate a group identifier, most commands first sort the data by a list of keys (such as {it:gvkey, year}) and then ask if the keys differ from one observation to the other.
Instead, {cmd:ftools} exploits the insights that sorting the data is not required to create an identifier,
and that once an identifier is created, we can then use a {it:counting sort} to sort the data in {it:O(N)} time instead of {it:O(N log N)}.
{pstd}
To create an identifier (that takes a value in {1, {it:#keys}}) we first map each key (composed of one or more numbers and strings) to a unique integer.
For instance, the key {it:gvkey=123, year=2010} is assigned the integer {it:4268248869} with the Mata function {cmd:hash1}.
This identifier can then be used as an index when accessing vectors, bypassing the need for sorts.
{pstd}
The program tries to pick the hash function that best matches the dataset and input variables.
For instance, if the input variables have a small range of possible values (e.g. if they are of {it:byte} type), we select the {it:hash0} method, which uses a (non-minimal) perfect hashing but might consume a lot of memory.
Alternatively, {it:hash1} is used, which builds on Mata's
{help mf_hash1:hash1} function to create a form of {browse "https://www.wikiwand.com/en/Open_addressing":open addressing} (that is more efficient than Mata's {help mf_asarray:asarray}).
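{pstd}
The counting sort mentioned above can be sketched in plain Mata. This is only an illustration under stated assumptions: {cmd:counting_order()} is a hypothetical helper written for this example (it is not part of {cmd:ftools}), and it assumes the levels are already integers between 1 and {it:num_levels}. It makes two passes over the data and never compares two observations, which is why its cost grows linearly with the number of observations:
{inp}
{hline 60}
mata:
// Hypothetical helper (not part of ftools): returns a permutation p such
// that levels[p] is sorted; assumes levels take values in 1..num_levels
real colvector counting_order(real colvector levels, real scalar num_levels)
{
    real colvector counts, pos, p
    real scalar i, n
    n = rows(levels)
    // Pass 1: count the frequency of each level
    counts = J(num_levels, 1, 0)
    for (i = 1; i <= n; i++) {
        counts[levels[i]] = counts[levels[i]] + 1
    }
    // pos[k] is the first free slot of level k in the sorted output
    pos = runningsum(counts) - counts :+ 1
    // Pass 2: place each observation in the next free slot of its level
    p = J(n, 1, .)
    for (i = 1; i <= n; i++) {
        p[pos[levels[i]]] = i
        pos[levels[i]] = pos[levels[i]] + 1
    }
    return(p)
}
levels = (3 \ 1 \ 2 \ 1 \ 3)
p = counting_order(levels, 3)
// Applying the permutation yields the levels in ascending order
assert(levels[p] == sort(levels, 1))
end
{hline 60}
{txt}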
{marker collapse}{...}
{title:Using the functions from {it:fcollapse}}
{pstd}
You can access the {cmd:aggregate_*()} functions so you can collapse information without resorting to Stata. Example:
{inp}
{hline 60}
sysuse auto, clear
mata: F = factor("turn")
mata: F.panelsetup()
mata: y = st_data(., "price")
mata: sum_y = aggregate_sum(F, F.sort(y), ., "")
mata: F.keys, F.counts, sum_y
* Benchmark
collapse (sum) price, by(turn)
list
{hline 60}
{txt}
{pstd}
Functions start with {cmd:aggregate_*()}, and are listed {view fcollapse_functions.mata, adopath asis:here}.
{marker experimental}{...}
{title:Experimental/advanced functions}
{p 8 16 2}
{it:real scalar}
{bind: }{cmd:init_zigzag(}{it:F1}{cmd:,}
{it:F2}{cmd:,}
{it:F12}{cmd:,}
{it:F12_1}{cmd:,}
{it:F12_2}{cmd:,}
{it:queue}{cmd:,}
{it:stack}{cmd:,}
{it:subgraph_id}{cmd:,}
{it:verbose}{cmd:)}
{pstd}Notes:
{synoptset 3 tabbed}{...}
{synopt:- }Given the bipartite graph formed by F1 and F2,
the function returns the number of disjoint subgraphs (mobility groups){p_end}
{synopt:- }F12 must be set with levels_as_keys==1{p_end}
{synopt:- }For F12_1 and F12_2, you can set save_keys==0{p_end}
{synopt:- }The function fills three useful vectors: queue, stack and subgraph_id{p_end}
{synopt:- }If subgraph_id==0, the id vector will not be created{p_end}
{marker source}{...}
{title:Source code}
{pstd}
{view ftools.mata, adopath asis:ftools.mata};
{view ftools_type_aliases.mata, adopath asis:ftools_type_aliases.mata};
{view ftools_main.mata, adopath asis:ftools_main.mata};
{view ftools_bipartite.mata, adopath asis:ftools_bipartite.mata};
{view fcollapse_functions.mata, adopath asis:fcollapse_functions.mata}
{p_end}
{pstd}
Also, the latest version is available online: {browse "https://github.com/sergiocorreia/ftools/source"}
{marker author}{...}
{title:Author}
{pstd}Sergio Correia{break}
{break}
{browse "http://scorreia.com"}{break}
{browse "mailto:sergio.correia@gmail.com":sergio.correia@gmail.com}{break}
{p_end}
{marker project}{...}
{title:More Information}
{pstd}{break}
To report bugs, contribute, ask for help, etc., please see the project URL on GitHub:{break}
{browse "https://github.com/sergiocorreia/ftools"}{break}
{p_end}
{marker acknowledgment}{...}
{title:Acknowledgment}
{pstd}
This project was largely inspired by the works of
{browse "http://wesmckinney.com/blog/nycpython-1102012-a-look-inside-pandas-design-and-development/":Wes McKinney},
{browse "http://www.stata.com/meeting/uk15/abstracts/":Andrew Maurer}
and
{browse "https://ideas.repec.org/c/boc/bocode/s455001.html":Ben Jann}.
{p_end}