
Julia - creating matrices of Union{Nothing,String} vs Union{Nothing,Bool}

Tags:

julia

In a program I have, I want to initialise a bunch of matrices with nothing and then, if some condition is met, change individual elements to a value of type Bool or String.

This works fine when I initialise with

Array{Union{Nothing,Bool},2}(undef,5,5)

which yields something looking like

5×5 Matrix{Union{Nothing, Bool}}:
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing

But not when I initialise with

Array{Union{Nothing,String},2}(undef,5,5)

which gives me

5×5 Matrix{Union{Nothing, String}}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

Now I can change values in that second array to Strings so that I get

5×5 Matrix{Union{Nothing, String}}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef     "Look"
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

But when I have a second large array constructed as

Array{Union{Nothing, Bool, String}}(undef,10,5)

which looks like

10×5 Matrix{Union{Nothing, Bool, String}}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

I can quite happily assign the first array (constructed out of Nothing and Bool) to its first 5 rows, but I can't assign the second matrix (constructed out of #undef/nothing and String) to it. Instead, I get this error:

UndefRefError: access to undefined reference

Initially I thought it had something to do with converting from type Nothing or #undef to String, but I seem to be able to do this when I assign individual strings as shown above.

Any ideas?

asked Oct 12 '25 10:10 by Pablo


1 Answer

Let me start with a recommendation.

In general, it is best to create your matrices in the following way:

julia> Array{Union{Nothing,String},2}(nothing,5,5)
5×5 Matrix{Union{Nothing, String}}:
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing
 nothing  nothing  nothing  nothing  nothing

In this way you ensure that they are properly initialized.
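
As a minimal sketch of how this recommendation applies to the situation in the question: the names strs and big are hypothetical, and the choice of rows 6:10 as the target is an assumption about what the original code does. Block assignment has to read every element of the source matrix, so it only works once all of its cells hold a value.

```julia
# Initialise every cell with `nothing` instead of leaving it #undef.
strs = Array{Union{Nothing,String},2}(nothing, 5, 5)
strs[3, 5] = "Look"          # individual writes work as before

# The larger matrix from the question, also initialised with `nothing`.
big = Array{Union{Nothing,Bool,String},2}(nothing, 10, 5)

# Block assignment reads each element of `strs`; all of them are defined now.
big[6:10, :] = strs

big[8, 5]   # "Look"
big[6, 1]   # nothing
```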

Now to explain what you observe. #undef, as opposed to nothing, is not a value. It means that the given cell in a matrix is not connected to any value. You cannot read from such a cell; you must first write to it before you can read it:

julia> x = Vector{String}(undef, 3)
3-element Vector{String}:
 #undef
 #undef
 #undef

julia> x[1]
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getindex(A::Vector{String}, i1::Int64)
   @ Base .\array.jl:801
 [2] top-level scope
   @ REPL[6]:1

julia> x[1] = "a"
"a"

julia> x
3-element Vector{String}:
    "a"
 #undef
 #undef

julia> x[1]
"a"

You might ask why, with the Union{Bool, Nothing} element type, you get values in the array, while with Union{String, Nothing} you get #undef, i.e. no value that you can read.

The answer is that in Julia there are two kinds of types:

  • bits types, for which the isbitstype function returns true; such data are immutable and do not contain references to other values; an example is Bool
  • non-bits types, which either are mutable or contain references; an example is String (technically, a string is represented as a reference to some location in memory where its contents are stored)

As you can see here:

julia> isbitstype(Bool)
true

julia> isbitstype(String)
false

Now, if your array is to store a bits type (or a union of bits types), then it stores the values directly, so there is always some value in each cell (there is no guarantee what value it will be, but you know you will get one), e.g.:

julia> Matrix{Int}(undef, 5, 5)
5×5 Matrix{Int64}:
 260255120  260384864  260235344  260254240  0
 260254240  260235344  260235344  260255120  0
 261849744  260235344  260235344  260235344  0
 260465792  260465440  260464224  260235344  0
 260235344  260235344  260235344  260235344  0

and as you can see we have some values stored in it, but it is undefined what the values would be.

On the other hand, if your array is to store a non-bits type, it actually stores references to the values. This means that when you create such an array without initializing it, you get #undef, which indicates that the cell holds no reference to a valid value.
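
To check which of the two storage layouts a given Union element type gets, something like the following sketch may help. Note that Base.isbitsunion is an unexported internal helper, so relying on it is an assumption that may not hold in every Julia version; the variable names are only for illustration.

```julia
# A Union is never itself a bits type, even if all of its members are.
union_is_bits = isbitstype(Union{Nothing,Bool})                  # false

# Base.isbitsunion (internal, unexported) reports whether a Union is
# stored inline, i.e. whether all of its members are bits types.
bool_inline   = Base.isbitsunion(Union{Nothing,Bool})            # true
string_inline = Base.isbitsunion(Union{Nothing,String})          # false
mixed_inline  = Base.isbitsunion(Union{Nothing,Bool,String})     # false
```

This matches what the question shows: only the Union{Nothing,Bool} matrix is stored inline, so only it displays values right after being created with undef.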

Just to show you that it has nothing to do directly with strings, consider the String7 type (exported e.g. by CSV.jl), which is a fixed-width string (maximum width of 7 bytes) that is a bits type. Observe the difference:

julia> using CSV

julia> isbitstype(String7)
true

julia> Matrix{String7}(undef, 5, 5)
5×5 Matrix{String7}:
 "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  …  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
 "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"     "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
 "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"     "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
 "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"     "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"
 "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"  "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"     "\0\0\0\0\x0f\x86_\x10\0\0\0\0\0\0\0\0"

julia> Matrix{String}(undef, 5, 5)
5×5 Matrix{String}:
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef
 #undef  #undef  #undef  #undef  #undef

In the first case you got a matrix of String7 values that is initialized, but the values it is initialized to are undefined (it is some garbage). In the second case you got a matrix of String values, and since they are non-bits the matrix is uninitialized - it does not hold any values yet. You first have to assign some values before you will be able to read them.

Finally, there is an isassigned function that allows you to check if the container has a value associated with some index (it returns false if the cell is #undef or the index is out of bounds). Here is an example:

julia> x = Vector{String}(undef, 3)
3-element Vector{String}:
 #undef
 #undef
 #undef

julia> x[1] = "a"
"a"

julia> isassigned(x, 1) # we have a value here
true

julia> isassigned(x, 2) # no value
false

julia> isassigned(x, 3) # no value
false

julia> isassigned(x, 4) # out of bounds
false
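
Building on isassigned, here is a minimal sketch of how an array that was already created with undef could be repaired after the fact. The helper name fill_unassigned! is hypothetical (it is not part of Base), and it is restricted to Array so that plain linear indices can be used.

```julia
# Hypothetical helper: write `value` into every cell that is still #undef,
# leaving cells that already hold a value untouched.
function fill_unassigned!(A::Array, value)
    for i in eachindex(A)                  # linear indices for an Array
        isassigned(A, i) || (A[i] = value)
    end
    return A
end

m = Matrix{Union{Nothing,String}}(undef, 2, 2)
m[1, 2] = "a"
fill_unassigned!(m, nothing)

m[1, 2]   # "a"
m[1, 1]   # nothing
```

Creating the matrix with nothing in the first place, as recommended at the top of this answer, is simpler; a helper like this is only worth considering when the array comes from code you cannot change.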

If anything is unclear please comment and I can expand the answer.

answered Oct 16 '25 07:10 by Bogumił Kamiński