ascii: only support checking String for pure ASCIIness #16396

Merged 2 commits on May 18, 2016
Changes from 1 commit
7 changes: 0 additions & 7 deletions base/ascii.jl

This file was deleted.

4 changes: 4 additions & 0 deletions base/deprecated.jl
@@ -1155,6 +1155,10 @@ end
@deprecate_binding UTF8String String
@deprecate_binding ByteString String

@deprecate ascii(p::Ptr{UInt8}, len::Integer) ascii(bytestring(p, len))
@deprecate ascii(p::Ptr{UInt8}) ascii(bytestring(p))
@deprecate ascii(x) ascii(string(x))

@deprecate ==(x::Char, y::Integer) UInt32(x) == y
@deprecate ==(x::Integer, y::Char) x == UInt32(y)
@deprecate isless(x::Char, y::Integer) UInt32(x) < y
21 changes: 3 additions & 18 deletions base/docs/helpdb/Base.jl
@@ -8719,28 +8719,13 @@ alias the input `x` to modify it in-place.
filt!

"""
ascii(::Array{UInt8,1})
ascii(s::AbstractString)
Contributor: make corresponding change in rst too

Contributor: would be good to migrate the remaining docstring inline

Member Author: Doesn't that get automatically done when the RST gets regenerated?

Contributor: This is what the script uses to find the correct place to insert the docstring so changing the signature will make the doc generation fail.

Member Author: The signatures now match...

Contributor: By inline I mean moving the markdown docstring to the source, so we can gradually get rid of this Base.jl file.

Contributor: You also should have run genstdlib and committed that at the same time, otherwise unrelated doc changes are ending up in multiple other PR's.


Create an ASCII string from a byte array.
"""
ascii(::Vector{UInt8})

"""
ascii(s)

Convert a string to a contiguous ASCII string (all characters must be valid ASCII characters).
Convert a string to `String` type and check that it contains only ASCII data, otherwise
throwing an `ArgumentError` indicating the position of the first non-ASCII byte.
"""
ascii(s)

"""
ascii(::Ptr{UInt8}, [length])

Create an ASCII string from the address of a C (0-terminated) string encoded in ASCII. A
copy is made; the ptr can be safely freed. If `length` is specified, the string does not
have to be 0-terminated.
"""
ascii(::Ptr{UInt8},?)

"""
maxabs(itr)

2 changes: 1 addition & 1 deletion base/printf.jl
@@ -98,7 +98,7 @@ function parse1(s::AbstractString, k::Integer)
while c in "#0- + '"
c, k = next_or_die(s,k)
end
flags = ascii(s[j:k-2])
flags = String(s[j:k-2])
# parse width
while '0' <= c <= '9'
width = 10*width + c-'0'
4 changes: 2 additions & 2 deletions base/socket.jl
@@ -586,6 +586,7 @@ function uv_getaddrinfocb(req::Ptr{Void}, status::Cint, addrinfo::Ptr{Void})
end

function getaddrinfo(cb::Function, host::String)
isascii(host) || error("non-ASCII hostname: $host")
callback_dict[cb] = cb
status = ccall(:jl_getaddrinfo, Int32, (Ptr{Void}, Cstring, Ptr{UInt8}, Any, Ptr{Void}),
eventloop(), host, C_NULL, cb, uv_jl_getaddrinfocb::Ptr{Void})
@@ -598,10 +599,9 @@ function getaddrinfo(cb::Function, host::String)
end
return nothing
end
getaddrinfo(cb::Function, host::AbstractString) = getaddrinfo(cb,ascii(host))
getaddrinfo(cb::Function, host::AbstractString) = getaddrinfo(cb, String(host))

function getaddrinfo(host::String)
isascii(host) || error("non-ASCII hostname: $host")
c = Condition()
getaddrinfo(host) do IP
notify(c,IP)
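
The socket.jl change swaps a throwing `ascii(host)` conversion for a plain `String(host)` conversion plus an explicit `isascii` guard. A minimal sketch of that guard pattern in isolation, assuming a current Julia release; `require_ascii_hostname` is a hypothetical helper name used only for illustration and is not part of Base or of this PR:

```julia
# Hypothetical helper (not in Base, not in this PR): validate first, convert second.
function require_ascii_hostname(host::AbstractString)
    # `isascii` only answers true/false; the caller picks the error to raise.
    isascii(host) || error("non-ASCII hostname: $host")
    # Conversion to `String` is now a separate step with no ASCII check of its own.
    return String(host)
end

require_ascii_hostname("example.com")      # returns the hostname as a String
# require_ascii_hostname("exämple.com")    # would raise an ErrorException
```

Keeping the check separate from the conversion lets the call site produce a message specific to hostnames, rather than the generic `ArgumentError` raised by `ascii`.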
10 changes: 10 additions & 0 deletions base/strings/util.jl
@@ -243,3 +243,13 @@ function bytes2hex(a::AbstractArray{UInt8})
end
return String(b)
end

# check for pure ASCII-ness

function ascii(s::String)
for (i, b) in enumerate(s.data)
b < 0x80 || throw(ArgumentError("invalid ASCII at index $i in $(repr(s))"))
Contributor: do we have or want a more specific exception type than ArgumentError for this? I guess we have UnicodeError but are you planning on removing that?
end
return s
end
ascii(x::AbstractString) = ascii(convert(String, x))
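
The new `ascii` method scans the string's bytes and throws at the first one with the high bit set. The diff reads the bytes through the `s.data` field, which `String` no longer has in current Julia releases. A rough equivalent for a current release, assuming `codeunits` as the byte accessor; the name `check_ascii` is hypothetical and chosen only to avoid clashing with Base's own `ascii`:

```julia
# Hypothetical re-implementation for illustration; Base already provides `ascii`.
function check_ascii(s::String)
    # `codeunits(s)` exposes the UTF-8 bytes of `s` without copying.
    for (i, b) in enumerate(codeunits(s))
        # Every ASCII byte is below 0x80; anything else is rejected.
        b < 0x80 || throw(ArgumentError("invalid ASCII at index $i in $(repr(s))"))
    end
    return s
end

# Other string types are converted to `String` first, then checked.
check_ascii(x::AbstractString) = check_ascii(convert(String, x))

check_ascii("Hello, world")                   # returns the input unchanged
check_ascii(SubString("Hello, world", 1, 5))  # converts to String, then checks
# check_ascii("Hello, ∀")                     # would throw an ArgumentError
```

As in the PR, a `String` argument is returned as-is rather than copied, so the function works as a validation step and performs no conversion between string types.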
1 change: 0 additions & 1 deletion base/sysimg.jl
@@ -125,7 +125,6 @@ include("iobuffer.jl")

# strings & printing
include("char.jl")
include("ascii.jl")
include("string.jl")
include("unicode.jl")
include("parse.jl")
4 changes: 2 additions & 2 deletions examples/clustermanager/simple/UnixDomainCM.jl
@@ -44,7 +44,7 @@ function connect(manager::UnixDomainCM, pid::Int, config::WorkerConfig)
if isa(address, Tuple)
sock = connect(address...)
else
sock = connect(ascii(address))
sock = connect(address)
end
return (sock, sock)
catch e
@@ -62,7 +62,7 @@ end
function start_worker(sockname, cookie)
Base.init_worker(cookie, UnixDomainCM(0))

srvr = listen(ascii(sockname))
srvr = listen(sockname)
while true
sock = accept(srvr)
Base.process_messages(sock, sock)
2 changes: 1 addition & 1 deletion examples/lru_test.jl
@@ -5,7 +5,7 @@ using LRUExample
TestLRU = LRUExample.UnboundedLRU{String, String}()
TestBLRU = LRUExample.BoundedLRU{String, String}(1000)

get_str(i) = ascii(vcat(map(x->[x>>4; x&0x0F], reinterpret(UInt8, [Int32(i)]))...))
get_str(i) = String(vcat(map(x->[x>>4; x&0x0F], reinterpret(UInt8, [Int32(i)]))...))

isbounded{L<:LRUExample.LRU}(::Type{L}) = any(map(n->n==:maxsize, fieldnames(L)))
isbounded{L<:LRUExample.LRU}(l::L) = isbounded(L)
18 changes: 10 additions & 8 deletions test/strings/basic.jl
@@ -216,17 +216,13 @@ end
# issue #11142
s = "abcdefghij"
sp = pointer(s)
@test ascii(sp) == s
@test ascii(sp,5) == "abcde"
@test typeof(ascii(sp)) == String
@test utf8(sp) == s
@test utf8(sp,5) == "abcde"
@test typeof(utf8(sp)) == String
s = "abcde\uff\u2000\U1f596"
sp = pointer(s)
@test utf8(sp) == s
@test utf8(sp,5) == "abcde"
@test_throws ArgumentError ascii(sp)
@test ascii(sp, 5) == "abcde"
@test_throws ArgumentError ascii(sp, 6)
@test typeof(utf8(sp)) == String

@test get(tryparse(BigInt, "1234567890")) == BigInt(1234567890)
@@ -494,7 +490,13 @@ foobaz(ch) = reinterpret(Char, typemax(UInt32))
# `bytestring`, `ascii` and `utf8`
@test_throws ArgumentError bytestring(Ptr{UInt8}(0))
@test_throws ArgumentError bytestring(Ptr{UInt8}(0), 10)
@test_throws ArgumentError ascii(Ptr{UInt8}(0))
@test_throws ArgumentError ascii(Ptr{UInt8}(0), 10)
@test_throws ArgumentError utf8(Ptr{UInt8}(0))
@test_throws ArgumentError utf8(Ptr{UInt8}(0), 10)

# ascii works on ASCII strings and fails on non-ASCII strings
@test ascii("Hello, world") == "Hello, world"
@test typeof(ascii("Hello, world")) == String
@test ascii(utf32("Hello, world")) == "Hello, world"
@test typeof(ascii(utf32("Hello, world"))) == String
@test_throws ArgumentError ascii("Hello, ∀")
@test_throws ArgumentError ascii(utf32("Hello, ∀"))