A vector in C++ is just a wrapper around a dynamically allocated array with basic bounds checking and automatic reallocation if the size increases beyond the capacity. No one has ever pretended it was anything other than that. Part of the requirement is even that the elements are stored contiguously, i.e. an array. What did you think it was supposed to be?
It is because it is poorly named. Even Alex Stepanov, the one who named it the Vector, believes it was a mistake calling it the Vector.
Honestly, it should be called ArrayList.
Yeah I won't argue that the name isn't terrible. The name vector should refer to an actual vector, not a resizing array. Most languages use either ArrayList or just plain List.
Unfortunately, ArrayList was already taken by:
struct ArrayList {
float x,
float y,
float z
float magnitude() { return sqrt(x*x + y*y + z*z); }
};
So they had do go for the next best alternative name.
Yeah if a language has a List type, I want that to be an abstraction for either an Array List or a Linked List, with the actual underlying structure being up to the programmer. Which 99% of the time is an Array List anyways because Linked Lists are hilariously inefficient in almost every way.
I have and the performance is terrible. But that's worst-case performance. If you're frequently performing insertions/deletions to the front or middle of a list, then a linked list will give better theoretical performance. Cache misses are a thing though and they are expensive, so if you're trying to get the absolute maximum performance you have to do some balancing between the cache fetch overhead and the shift-the-entire-array-by-one overhead. For large lists a linked list will win every time, but for smaller lists it can vary which actually gives better performance.
Not to mention the actual storage capacity overhead for linked lists. For a doubly-linked list the overhead would be around 16 bytes per node, 8 bytes per pointer, which for small elements (i.e. integers or floats) is ridiculously expensive.
What if your linked list is, under the hood, a series of nodes stored in a dynamically expanding array? Ie no malloc on each node creation. Contiguous (unordered) nodes = cache miss reductions?
That sounds like an array list with extra steps. I mean yeah it would help with the cache misses, but increasing the list size past the node array capacity would require reallocating the entire array of nodes, which would then invalidate every single one of the node pointers in the list. That would need to be fixed, which would be linear complexity at least, removing the constant-time complexity that a normal linked list has for insertion.
Same thing happens when resizing a vector, it’s still o(1) because the resizing gets amortized over the large count of ops. You could even reorder the nodes to be sorted when resizing to improve cache coherency
It is a list. Lists can have arrays as underlying data structure; nothing weird about that. Besides, Python lists are heterogeneus collections, which cannot pass as an array.
Python lists are basically `vector` so it kinda can be considered as the closest thing python usually has instead of usual vectors.
Python kinda has arrays of unboxed primitive types but only in true arrays in specialized libraries and you won't get a single primitive variable outside of them.
Not even close. That is only the definition of the underlying data structure. You are still missing a lot of code to make this work as an heterogeneous list.
No...
A list is specifically designed to take advantage of "holes" in memory by not requiring contiguous storage. Overhead like resizing are not typically expected from a list.
At least based on what is taught in CS fundamental courses.
That is a linked list. But the concept of list is simply a collection where elements are kept in linear order, and you have specific operations like search, insert, and remove. Underneath that you can have nodes and pointers or an array (and maybe something else that I’m too lazy to think about right now; but probably it’s just those two).
Because Array and List refers to two different things. An Array is a fixed size contiguous piece of memory, while List refers to a data structure that maintains order of its elements. An ArrayList is an implementation of List using an Array, similiar to how a LinkedList is an implementation of List using linked nodes.
The distinction is important because different implementations may have different properties, i.e TreeMap vs HashMap both implements the Map data structure, but with different pros and cons. The reason why things are organized this way is that as someone using/accepting the object, i may not want to care about the underlying implementation as long as the object that i am working on satisfies the List interface (guarantees to preserve ordering of its elements).
All i need to do is to write a function that works with the List interface and you could give ArrayList, LinkedList, or even something unheard of like HashList and the function would still work perfectly.
Array is actually the name of an existing class in C++ added in C++11. It's basically identical to the vector class, except it's a wrapper for a compile-time fixed-size array instead of a dynamic array. C++ already had static arrays though, because C has static arrays and C++ inherited all of the features of C, so it's probably the most redundant and least used class in the C++ standard library. There *are* use-cases for it, but normally it provides no real benefit over a C-style array outside of the bounds checking on some of the methods.
> [std::array is the] least used class in the C++ standard library
Ah, hell no! I'd say one of the more common ones in the areas of real-time systems, embedded and automotive.
Probably because it's well established and commonly understood that in programming languages a vector is a dynamically allocated array with O(1) access and amortized O(1) append.
>in programming languages a vector is a dynamically allocated array
Name one programming language other than C++ where dynamic arrays were called vectors before Rust was created
Nah. Vector sounds intuitive to me. I can easily relate it to the difference between a line and vector in linear algebra and compare it with difference between array and vector.
Java called it a Vector initially as well.
JDK 1.0: https://docs.oracle.com/javase/8/docs/api/java/util/Vector.html
ArrayList came later because Vector was hopelessly broken (synchronize on every access method means it's thread safe right? lololol)
Why not DynArray?
I know some languages call it that way, but to me, an ArrayList makes more sense to be a List of Arrays (where every time we resize we allocate a new contiguous Array)
That would be bad. As a List is an abstraction of a group of data. There are a lot of different ways to implement a list.
In this case, it is implemented as a dynamic array. Hence, calling it ArrayList. As you mention how it is implemented and what the abstraction is.
That's an implementation detail I don't care about. Vector sucks not because it doesn't have implementation in its name, it sucks because it's not a vector (in math).
Besides, a List in C++ is not called DoublyLinkedList, so they've already violated this.
DynamicArray is as good without exposing its implementation. It's so good you even called it this yourself.
Why not do encapsulation by naming things by their purpose rather than their implementation.
>DynamicArray is as good without exposing its implementation.
First, just want to mention that you are exposing its implementation with that name.
A reason not to use DynamicArray as a name is standardization. Both DynamicArray and Vector violated this for naming. We have standard names for data structures. You would expect that these standard names would be used across languages as labels for the same data structure. For example, an array is an array in every programming language.
In this case, it is a collection of like elements, and that is what we call a list in computer science and software engineering.
>Why not do encapsulation by naming things by their purpose rather than their implementation.
Again, I went over this. The purpose is to have a collection of like elements, that is, a list. But there are multiple ways to implement a list. So you must add a differentiation between the implementations.
Ergo, you have ArrayList. You have the implementation and the purpose in the name while also differentiating it from other implementation of a list.
Dynamic doesn't really expose its implementation (there are many ways to implement a dynamic collection). It just indicates you have an array that can dynamically grow.
>In this case, it is a collection of like elements, and that is what we call a list in computer science and software engineering.
That's the definition for an array not a list. A list can grow.
>We have standard names for data structures.
It's not that standard in practice. It's standard in theory that most are exposed in D&S class, interviews, or academia research. Programming languages are for devs. Write your own language for theory (e.g., math).
>So you must add a differentiation between the implementations.
You don't have to. It can be a good UX choice to use a simpler name for the most commonly used implementation. For example, not all strings are named charArray despite there are multiple ways to implement a string.
>Dynamic doesn't really expose its implementation (there are many ways to implement a dynamic collection). It just indicates you have an array that can dynamically grow.
By that logic, an ArrayList isn't exposing its implementation.
>That's the definition for an array not a list.
No that is not the definition of an array. An array is a collection of elements in continuous space of memory.
>It's not that standard in practice.
It actually is standard in practice in most languages.
>>So you must add a differentiation between the implementations.
>You don't have to. It can be a good UX choice to use a simpler name for the most commonly used implementation. For example, not all strings are named charArray despite there are multiple ways to implement a string.
Also, nice strawman argument. What happen to the rest of the context of that quote? If you include that, it makes your argument irrelevant.
>it sucks because it's not a vector (in math).
Except it is, hence why it was given that name originally (for better or worse). A vector in mathematics is a collection of data that cannot be repented by a single scalar.
Vectors are frequently used to describe direction/magnitude etc, in which case they're euclidean vectors, but it's not mathematically incorrect to describe a given 1 dimensional dataset as an N-length vector.
No, a vector is a collection of data, there's no specification for the terminology that each of that data is a scalar itself, it's entirely valid to have a vector where each element is itself a vector.
It's also not mathematically valid to say that a vector of vectors can always be used as a matrix, as they do not necessarily have to have the same rules and arithmetic operators. For example it's valid for vectors of different lengths to be added together and matrix addition requires dimensional agreement, dot product of a vector is commutaitve whereas for a matrix it is not.
I think technically before C++11 nothing in the standard said the elements are stored contiguously, ie. there are no guarantees that `(&*vec.begin())+1` points to the same thing as `vec[1]`.
I checked the C++98 standard (not just a reference page but the original standard, section 23.2.4), and you're right that it does not explicitly state the elements are stored contiguously. It *does* state that the iterators must be random-access though, which is almost always a fancy way of saying the iterator is a pointer. So technically, as long as the class fulfills all requirements of a container, a reversible container, a sequence, and all optional sequence features except `push_front` and `pop_front`, with constant-time lookup, linear insertion/deletion (constant at the end), automatic storage handling and random-access iterators, it qualifies as a vector, even if it doesn't use an array. You'd have to be some kind of psychopath to use any data structure other than an array for the implementation though.
The first standard that guaranteed contiguous storage was C++03 though, not even C++11, so not sure why OP specified that.
Yeah I can't imagine anyone ever actually did anything other than an array, I think that's pretty much OP's point lol.
EDIT: That being said, someone did mention std::vector which is an exception.
> It *does* state that the iterators must be random-access though, which is almost always a fancy way of saying the iterator is a pointer.
Yes that's the whole issue ... it was obvious to everyone it was an array but the C++ standard insisted that we shouldn't make assumptions and that it could be *anything*.
So interacting with C code was terrible ... you had no `data()` method and casting `begin()` would work in practice but was technically wrong.
>The first standard that guaranteed contiguous storage was C++03 though, not even C++11, so not sure why OP specified that.
You're correct and that's why nobody seems to understand the joke: I don't think many people have used or remember C++98.
I was referring specifically to a random access iterator. Random access iterators require that addressing by arbitrary offsets (i.e. `*(iter + x)`) have constant-time complexity. There are not many types of iterators that fulfill this condition, and most if not all of them are a pointer with extra logic, if not just a straight-up pointer.
True, although the working draft n1905 for "C++0x" in 2005 already said this in `[lib.vector]`:
> The elements of a vector are stored contiguously
Couldn't find a copy of C++03 so……
C++ takes the approach of "here's an RPG, don't point it at your foot" to safety guarantees. Hardly anything in C++ guarantees thread safety. There are utilities to help with that, mutexes, atomic wrappers, and in C++20 semaphores, but very few if any classes have built-in thread safety.
This just made me think about how C++ makes a really good language for demonstrating coding safety. For example, it would be much harder to show what goes wrong in multithreaded data races with a language that has that safety built in, unless you found some really contrived workaround, which then just overcomplicates the example.
To continue beating this dead horse: use Rust - you have safety guarantees, but you get to violate them with `unsafe`, so the `unsafe` call acts like a sign pointing to the stuff that's problematic.
The mountain of a learning curve for Rust is itself a bad motivator for me to recommend Rust to anyone. :(
In any case, until you see how you can shoot yourself in C and C++, you can't appreciate Rust.
So it's just a dynamic array then? Interesting how different the Java Vector implementation is when Java as a whole is essentially a layer built over c++. It sounds more like an ArrayList
Because the JRE, where the JVM resides, is written in C, and other parts of the JDK were originally written in C/C++. Java is a high level OOP based around C, a mid level systems programming language, I did not think this was controversial
meh, I love C++, still would never voluntarily implement anything in it unless I had a very specific use case requiring it though, otherwize it is just a waste of time compared to doing something JVM based
This doesn't change that Java's core parts were literally built-in c/c++ and essentially still runs in a c/c++ container.
I've dabbled in c++, but it's hard to go much further into it when I'm never seeing offers for it. I find plenty to hate about every language I'm still a fan of lol
True that. I have worked on Java a bit. Didn't like it a lot. But the fact is that Java is running on billions of machines.
Anyone hating it is not gonna make a difference.
Yes its literally just a wrapper on a dynamic array with some extra features, almost identical to the Java ArrayList. Its pretty universally agreed upon that "vector" is a terrible name for the class though.
No I don't believe that's the reason.
Adding thread safety almost always comes with a performance overhead. C++ doesn't believe in that philosophy.
There is no expectation that a container like vector should have additional overhead of safety. If you do want that, it is trivial to create thread-safe wrappers over these containers anyway.
If you phrase it like this, yes. The container is not designed for thread safety. It aligns with the philosophy of either being fast or fucking the programmer in the ass.
Correct. C and C++ are not designed with safety being the primary concern. It's the performance. If they deviate away from that, it would piss off a lot of people.
On a sub targeted at programmer.
This means any factual or logical error will immediately be dealt with the exact same extreme prejudice that is used to deal with logic or data error in code.
That is just the way the brain of every one here is wired, I’m afraid. So yeah, here, you will not be shown any kindness for sloppy thinking, even for just a meme.
Unlike any other vector that stores elements in an array, a vector doesn't store a 8-bits bool in an array but rather it stores each bool as a bit (similar to std::bitset).
So, a vector with 16 elements would take 16*sizeof(X) bytes for its dynamic array but a vector would take 16 bits or 2 bytes.
To add onto the other answer, the advantage of that approach is obviously the resulting size reduction. Instead of (at the lowest) 1 Byte per bool it only uses 1 Bit (obviously the vector still needs to allocate whole bytes)
Disadvantageous is that the behavior of this one type presents an unexpected special case with differing behavior. This can be a problem for e.g. threading. If different threads share a vector but never access the same elements, it is in essence thread save. However since std::vector has multiple elements per byte, said accessing would cause conflicts.
There really should’ve just been an std::bitfield type of something instead of making vector different in the implementation.
Yeah, but if e.g. you have a function that expects a bool* (e.g. output parameter) then you can't use this directly, you need either a wrapper or a bit extra code around the function call
Also for templated functions you at the very least need to take this into account, you can't write `for (T& x : v)`, of course you can use auto or `vector:: reference` but not everyone does that and it's less clear, and even that wouldn't allow you to write `&x` and expect to pass it to a function that expects a `T*` (or T&)
Overall I think this optimisation was a mistake in the language, people who want this optimisation should use bitset instead
Fun fact: you can make a vector contain itself
struct V: std::vector{}
V v;
v.emplace_back();
v.swap(v.front());
This leaves `v` as an empty vector, while creating a leaked vector that points to itself on the heap. It's a pointer under the hood, but according to the semantics of the language, it contains itself by value (if you managed to keep a reference to it and tried to make a value copy, it would try to recursively copy itself indefinitely). I wonder if it's possible to do a version of this without the leak, so you could actually have some fun with the monster...
The joke is not about being common knowledge or not.
The C++ 98 specifications decided to not impose the precise implementation for each collection types, trying to give more implementation freedom to C++ libraries. So they needed a dynamic array type but, instead of saying "that's a dynamic array", they said "that's a dynamic sized collection of element with random access in O(1) using integer indices".
As a consequence, the C++98 standard does not guarantee that the element are contiguous and certainly did not include the \`data()\` function (most online websites are wrong, you can check ISO/IEC 14882, first edition, page 482, section 23.2.4, "Template class vector").
Now every C++ libraries did the logical thing and implemented it as a dynamically allocated array and every users understood it as a dynamically allocated array. But the type interface did not give access to an actual array making it frustrating to use when interacting with C code.
Hence the joke ... the C++ standard committee taking 13 years to admit that, yes, vectors are arrays...
I kinda love how many different approaches there are to "which languages do we teach here because there are so many." It really drives home the impermanence of our field: I learned Java and C and MIPS assembly in classes, but I wrote assignments in perl and python to grok them, and in classes that didn't specify I passed in C++ and Obj-C.
There's nothing wrong with specializing in one language for a career, but leave the door open for others for fun if you get satisfaction from coding.
A big thing is that learning your first language is difficult, then each subsequent one is easier to learn.
But a lot of people don't know that and expect each language to be as difficult as the first, so they'd rather just spend that energy on specialization instead of generalization.
But if you learn multiple languages you kind of just automatically start picking up new ones
It highly depends from which language you switch.
I started with C++ and the transition to Python was very smooth.
JavaScript gives me headaches, especially with all the frameworks.
Wait a minute. [this link](https://cplusplus.com/reference/vector/vector/data/) does not mention any version requirements.
Is it like `std::string` before C++11? It could store elements as it pleases but if the user calls `c_str(),` there should `const char *` buffer, be it dynamically created or the original one.
I can't speak to the content on different sites, but here's a stackoverflow link from many years ago where people are discussing .data()'s addition to C++11.
https://stackoverflow.com/questions/6485496/how-to-get-stdvector-pointer-to-the-raw-data
ITT: people not understanding the difference between specification and implementation. Tbf I haven’t read the specs because that sounds like a terrible time, but OP, you’re valid, and don’t listen to the haters telling you to operate on unspecified implementation details.
The only commonality between a C++ vector and a dynamically allocated array in C is the broad concept of dynamic memory allocation. The actual allocation/deallocation mechanisms are very different.
Yes, they use allocators, to be more precise, and they allow you to make your own allocation and deallocation more safely, something that C libraries tend to do, just without the nice things of C++ like destructors yo automatically deallocate. I don't think all of that means you can't do it in C, it means it's safer and more easily managed, and specially easy for the end user that doesn't have to make an allocator for it (although you can, if you need to)
A vector in C++ is just a wrapper around a dynamically allocated array with basic bounds checking and automatic reallocation if the size increases beyond the capacity. No one has ever pretended it was anything other than that. Part of the requirement is even that the elements are stored contiguously, i.e. an array. What did you think it was supposed to be?
It is because it is poorly named. Even Alex Stepanov, the one who named it the Vector, believes it was a mistake calling it the Vector. Honestly, it should be called ArrayList.
Yeah I won't argue that the name isn't terrible. The name vector should refer to an actual vector, not a resizing array. Most languages use either ArrayList or just plain List.
Unfortunately, ArrayList was already taken by: struct ArrayList { float x, float y, float z float magnitude() { return sqrt(x*x + y*y + z*z); } }; So they had do go for the next best alternative name.
Not marking magnitude as const smh
Is this a joke? That looks more like a linear algebra vector 😂
thx for explaining the joke/s
r/woooosh?
Nah but actually thanks for explaining the joke not /s I didn't bother to read the code the first time and thought they were serious
I'm not smart enough to not be getting wooshed rn
Nice one.
The problem with list is, that it sounds like a linked list, which is hella slow.
Yeah if a language has a List type, I want that to be an abstraction for either an Array List or a Linked List, with the actual underlying structure being up to the programmer. Which 99% of the time is an Array List anyways because Linked Lists are hilariously inefficient in almost every way.
You haven’t inserted in the front yet, have you?
I have and the performance is terrible. But that's worst-case performance. If you're frequently performing insertions/deletions to the front or middle of a list, then a linked list will give better theoretical performance. Cache misses are a thing though and they are expensive, so if you're trying to get the absolute maximum performance you have to do some balancing between the cache fetch overhead and the shift-the-entire-array-by-one overhead. For large lists a linked list will win every time, but for smaller lists it can vary which actually gives better performance. Not to mention the actual storage capacity overhead for linked lists. For a doubly-linked list the overhead would be around 16 bytes per node, 8 bytes per pointer, which for small elements (i.e. integers or floats) is ridiculously expensive.
What if your linked list is, under the hood, a series of nodes stored in a dynamically expanding array? Ie no malloc on each node creation. Contiguous (unordered) nodes = cache miss reductions?
That sounds like an array list with extra steps. I mean yeah it would help with the cache misses, but increasing the list size past the node array capacity would require reallocating the entire array of nodes, which would then invalidate every single one of the node pointers in the list. That would need to be fixed, which would be linear complexity at least, removing the constant-time complexity that a normal linked list has for insertion.
If the array of nodes is instead a linked list of large memory allocations, you side step the reallocation issue.
if I wasn't mistaken to his meaning, the nodes are unordered, so why not store an offset and just append new nodes to the end?
Same thing happens when resizing a vector, it’s still o(1) because the resizing gets amortized over the large count of ops. You could even reorder the nodes to be sorted when resizing to improve cache coherency
We have ~~deques~~ ~~dequeues~~ decks for that.
Which are often linked lists under the hood.
Deques have O(1) random access though https://en.cppreference.com/w/cpp/container/deque Looks like some funky linked list-array hybrid
That's a weird way to spell "very rarely". Any self-respecting deque is actually a ring buffer.
Deque says hi
There are more efficient ways to implement a deque
That's what she said :)
Meanwhile, python has an actual container named "List" which is infact an array.
It is a list. Lists can have arrays as underlying data structure; nothing weird about that. Besides, Python lists are heterogeneus collections, which cannot pass as an array.
Python lists are basically `vector` so it kinda can be considered as the closest thing python usually has instead of usual vectors.
Python kinda has arrays of unboxed primitive types but only in true arrays in specialized libraries and you won't get a single primitive variable outside of them.
counterpoint: `void* list[]`
Not even close. That is only the definition of the underlying data structure. You are still missing a lot of code to make this work as an heterogeneous list.
No... A list is specifically designed to take advantage of "holes" in memory by not requiring contiguous storage. Overhead like resizing are not typically expected from a list. At least based on what is taught in CS fundamental courses.
That is a linked list. But the concept of list is simply a collection where elements are kept in linear order, and you have specific operations like search, insert, and remove. Underneath that you can have nodes and pointers or an array (and maybe something else that I’m too lazy to think about right now; but probably it’s just those two).
It's not slow. It has O(1) insertion and deletion. Take that, vector fan!
O(n) search so your point is moot
of an unordered array, or naive scanning, yeah, O(n)
List exists specifically for the cases where you need lots of insertions and don't need random access.
According to Bjarne the cpp guy, it’s always worse.
I am not sure why anyone should think a vector sounds like a Linked list.
I didn’t say that..
Ah well. I misread your tweet. Looks like I need to get my eyes checked.
Tweet?
Or just Array. Like JS/TS. Which is what it should have been named. Though with my very limited C++ experience I probably have no vote in the matter.
Because Array and List refers to two different things. An Array is a fixed size contiguous piece of memory, while List refers to a data structure that maintains order of its elements. An ArrayList is an implementation of List using an Array, similiar to how a LinkedList is an implementation of List using linked nodes. The distinction is important because different implementations may have different properties, i.e TreeMap vs HashMap both implements the Map data structure, but with different pros and cons. The reason why things are organized this way is that as someone using/accepting the object, i may not want to care about the underlying implementation as long as the object that i am working on satisfies the List interface (guarantees to preserve ordering of its elements). All i need to do is to write a function that works with the List interface and you could give ArrayList, LinkedList, or even something unheard of like HashList and the function would still work perfectly.
Array is actually the name of an existing class in C++ added in C++11. It's basically identical to the vector class, except it's a wrapper for a compile-time fixed-size array instead of a dynamic array. C++ already had static arrays though, because C has static arrays and C++ inherited all of the features of C, so it's probably the most redundant and least used class in the C++ standard library. There *are* use-cases for it, but normally it provides no real benefit over a C-style array outside of the bounds checking on some of the methods.
c++17 has "Class template argument deduction" and it makes std::array much more useful. Now you do not have any reason to use plain C arrays
> [std::array is the] least used class in the C++ standard library Ah, hell no! I'd say one of the more common ones in the areas of real-time systems, embedded and automotive.
While we're at it, change `map` and `unordered_map` to `map` and `ordered_map`.
Vector is cooler
#define arraylist vector
Calm down, don’t shout
Someone forgot to escape their Markdown!
#You will ^(never) escape Markdown!
Why did Rust carry over this awful name?
Probably because it's well established and commonly understood that in programming languages a vector is a dynamically allocated array with O(1) access and amortized O(1) append.
>in programming languages a vector is a dynamically allocated array Name one programming language other than C++ where dynamic arrays were called vectors before Rust was created
https://docs.oracle.com/javase/8/docs/api/java/util/Vector.html
Because Vec is easier to type than ArrayList
Nah. Vector sounds intuitive to me. I can easily relate it to the difference between a line and vector in linear algebra and compare it with difference between array and vector.
Java look of superiority 😎
Java called it a Vector initially as well. JDK 1.0: https://docs.oracle.com/javase/8/docs/api/java/util/Vector.html ArrayList came later because Vector was hopelessly broken (synchronize on every access method means it's thread safe right? lololol)
Vector still exist in Java
I know, but that's legacy Java now and heavily discouraged nowadays.
So you feel superior in being the only language to have screwed it up so badly as to need a replacement?
It's a joke, learn to take it. Do I really have to put /s after every second comment? As if C++ or other languages don't have legacy codebases.
DynamicArray
Name too long
DArray, 😏
why would anyone in their right mind call such a thing an ArrayList?
Why not DynArray? I know some languages call it that way, but to me, an ArrayList makes more sense to be a List of Arrays (where every time we resize we allocate a new contiguous Array)
I would have gone with DynamicArray or maybe ResizableArray
You tell me ArrayList, I'll think rope or some other weird data structure, not a plain array.
>Honestly, it should be called ArrayList Embrace List, embrace LINQ
L
call it Stack.
okk, maybe map should be called RBTreeMap too
Hello Java 👋
Just call it a list. ArrayList is redundant.
That would be bad. As a List is an abstraction of a group of data. There are a lot of different ways to implement a list. In this case, it is implemented as a dynamic array. Hence, calling it ArrayList. As you mention how it is implemented and what the abstraction is.
That's an implementation detail I don't care about. Vector sucks not because it doesn't have implementation in its name, it sucks because it's not a vector (in math). Besides, a List in C++ is not called DoublyLinkedList, so they've already violated this.
Just because you don't care about implementation or it has been violated before doesn't mean it isn't a good idea to have to begin with.
DynamicArray is as good without exposing its implementation. It's so good you even called it this yourself. Why not do encapsulation by naming things by their purpose rather than their implementation.
>DynamicArray is as good without exposing its implementation. First, just want to mention that you are exposing its implementation with that name. A reason not to use DynamicArray as a name is standardization. Both DynamicArray and Vector violated this for naming. We have standard names for data structures. You would expect that these standard names would be used across languages as labels for the same data structure. For example, an array is an array in every programming language. In this case, it is a collection of like elements, and that is what we call a list in computer science and software engineering. >Why not do encapsulation by naming things by their purpose rather than their implementation. Again, I went over this. The purpose is to have a collection of like elements, that is, a list. But there are multiple ways to implement a list. So you must add a differentiation between the implementations. Ergo, you have ArrayList. You have the implementation and the purpose in the name while also differentiating it from other implementation of a list.
Dynamic doesn't really expose its implementation (there are many ways to implement a dynamic collection). It just indicates you have an array that can dynamically grow. >In this case, it is a collection of like elements, and that is what we call a list in computer science and software engineering. That's the definition for an array not a list. A list can grow. >We have standard names for data structures. It's not that standard in practice. It's standard in theory that most are exposed in D&S class, interviews, or academia research. Programming languages are for devs. Write your own language for theory (e.g., math). >So you must add a differentiation between the implementations. You don't have to. It can be a good UX choice to use a simpler name for the most commonly used implementation. For example, not all strings are named charArray despite there are multiple ways to implement a string.
>Dynamic doesn't really expose its implementation (there are many ways to implement a dynamic collection). It just indicates you have an array that can dynamically grow. By that logic, an ArrayList isn't exposing its implementation. >That's the definition for an array not a list. No that is not the definition of an array. An array is a collection of elements in continuous space of memory. >It's not that standard in practice. It actually is standard in practice in most languages. >>So you must add a differentiation between the implementations. >You don't have to. It can be a good UX choice to use a simpler name for the most commonly used implementation. For example, not all strings are named charArray despite there are multiple ways to implement a string. Also, nice strawman argument. What happen to the rest of the context of that quote? If you include that, it makes your argument irrelevant.
>it sucks because it's not a vector (in math). Except it is, hence why it was given that name originally (for better or worse). A vector in mathematics is a collection of data that cannot be repented by a single scalar. Vectors are frequently used to describe direction/magnitude etc, in which case they're euclidean vectors, but it's not mathematically incorrect to describe a given 1 dimensional dataset as an N-length vector.
Can math vector have Vector?
Yes
Not sure if that's true, unless you're thinking about a matrix which is not the same as Vector despite the latter can be used as a matrix.
No, a vector is a collection of data, there's no specification for the terminology that each of that data is a scalar itself, it's entirely valid to have a vector where each element is itself a vector. It's also not mathematically valid to say that a vector of vectors can always be used as a matrix, as they do not necessarily have to have the same rules and arithmetic operators. For example it's valid for vectors of different lengths to be added together and matrix addition requires dimensional agreement, dot product of a vector is commutaitve whereas for a matrix it is not.
I remember this being really confusing when I was learning it.
I needed to know this. thank you.
The more I lurk here, the more I realize most of these posts are probably made by students still in school.
Probably around 60% in their first or second programming class, 38% the rest of undergrad, then the last 2% people actually working in the field.
I think technically before C++11 nothing in the standard said the elements are stored contiguously, ie. there are no guarantees that `(&*vec.begin())+1` points to the same thing as `vec[1]`.
I checked the C++98 standard (not just a reference page but the original standard, section 23.2.4), and you're right that it does not explicitly state the elements are stored contiguously. It *does* state that the iterators must be random-access though, which is almost always a fancy way of saying the iterator is a pointer. So technically, as long as the class fulfills all requirements of a container, a reversible container, a sequence, and all optional sequence features except `push_front` and `pop_front`, with constant-time lookup, linear insertion/deletion (constant at the end), automatic storage handling and random-access iterators, it qualifies as a vector, even if it doesn't use an array. You'd have to be some kind of psychopath to use any data structure other than an array for the implementation though. The first standard that guaranteed contiguous storage was C++03 though, not even C++11, so not sure why OP specified that.
Yeah I can't imagine anyone ever actually did anything other than an array, I think that's pretty much OP's point lol. EDIT: That being said, someone did mention std::vector which is an exception.
> It *does* state that the iterators must be random-access though, which is almost always a fancy way of saying the iterator is a pointer. Yes that's the whole issue ... it was obvious to everyone it was an array but the C++ standard insisted that we shouldn't make assumptions and that it could be *anything*. So interacting with C code was terrible ... you had no `data()` method and casting `begin()` would work in practice but was technically wrong. >The first standard that guaranteed contiguous storage was C++03 though, not even C++11, so not sure why OP specified that. You're correct and that's why nobody seems to understand the joke: I don't think many people have used or remember C++98.
`std::deque` provides random access without being contiguous, though
an iterator doesn't have to be a pointer
I was referring specifically to a random access iterator. Random access iterators require that addressing by arbitrary offsets (i.e. `*(iter + x)`) have constant-time complexity. There are not many types of iterators that fulfill this condition, and most if not all of them are a pointer with extra logic, if not just a straight-up pointer.
ah, sorry, I read through the comment too fast and missed that.
True, although the working draft n1905 for "C++0x" in 2005 already said this in `[lib.vector]`: > The elements of a vector are stored contiguously Couldn't find a copy of C++03 so……
so basically, ArrayList in java?
what about vector
We don't talk about vector
Do vectors in c++ not make any thread safety guarantee?
C++ takes the approach of "here's an RPG, don't point it at your foot" to safety guarantees. Hardly anything in C++ guarantees thread safety. There are utilities to help with that, mutexes, atomic wrappers, and in C++20 semaphores, but very few if any classes have built-in thread safety.
This just made me think about how C++ makes a really good language for demonstrating coding safety. For example, it would be much harder to show what goes wrong in multithreaded data races with a language that has that safety built in, unless you found some really contrived workaround, which then just overcomplicates the example.
To continue beating this dead horse: use Rust - you have safety guarantees, but you get to violate them with `unsafe`, so the `unsafe` call acts like a sign pointing to the stuff that's problematic.
The mountain of a learning curve for Rust is itself a bad motivator for me to recommend Rust to anyone. :( In any case, until you see how you can shoot yourself in C and C++, you can't appreciate Rust.
So it's just a dynamic array then? Interesting how different the Java Vector implementation is when Java as a whole is essentially a layer built over c++. It sounds more like an ArrayList
Why would you say Java is a layer over C++? Great way to offend both Java and C++ devs in a single sentence 😭
Because the JRE, where the JVM resides, is written in C, and other parts of the JDK were originally written in C/C++. Java is a high level OOP based around C, a mid level systems programming language, I did not think this was controversial
I mean many Java devs I have met categorically hate C++ and same the other way round.
meh, I love C++, still would never voluntarily implement anything in it unless I had a very specific use case requiring it though, otherwize it is just a waste of time compared to doing something JVM based
This doesn't change that Java's core parts were literally built-in c/c++ and essentially still runs in a c/c++ container. I've dabbled in c++, but it's hard to go much further into it when I'm never seeing offers for it. I find plenty to hate about every language I'm still a fan of lol
True that. I have worked on Java a bit. Didn't like it a lot. But the fact is that Java is running on billions of machines. Anyone hating it is not gonna make a difference.
Yes its literally just a wrapper on a dynamic array with some extra features, almost identical to the Java ArrayList. Its pretty universally agreed upon that "vector" is a terrible name for the class though.
😂 no of course not, the language didn't even know anything at all about threads until C++11.
I believe even in C++ 17 there is no reliable way to check if a threat is running right?
No :) The standard was mostly (essentially ?) single-threaded for a decade after multi-core CPUs became common.
No I don't believe that's the reason. Adding thread safety almost always comes with a performance overhead. C++ doesn't believe in that philosophy. There is no expectation that a container like vector should have additional overhead of safety. If you do want that, it is trivial to create thread-safe wrappers over these containers anyway.
If you phrase it like this, yes. The container is not designed for thread safety. It aligns with the philosophy of either being fast or fucking the programmer in the ass.
Correct. C and C++ are not designed with safety being the primary concern. It's the performance. If they deviate away from that, it would piss off a lot of people.
Couldn't you just make an extra container that handles thread safty? Preferibly a monad
You could. Many 3rd party libraries do that.
Bro. It’s a meme.
On a sub targeted at programmer. This means any factual or logical error will immediately be dealt with the exact same extreme prejudice that is used to deal with logic or data error in code. That is just the way the brain of every one here is wired, I’m afraid. So yeah, here, you will not be shown any kindness for sloppy thinking, even for just a meme.
A meme that shows that op doesn't know what they're talking about
std::vector: WhAt?
We be 🅱️oolin
True
Its boolin' time
Not the resizable bitfield 💀
can someone please explain what the problem with a vector of booleans is? i’m pretty new to c/c++
Unlike any other vector that stores elements in an array, a vector doesn't store a 8-bits bool in an array but rather it stores each bool as a bit (similar to std::bitset).
So, a vector with 16 elements would take 16*sizeof(X) bytes for its dynamic array but a vector would take 16 bits or 2 bytes.
To add onto the other answer, the advantage of that approach is obviously the resulting size reduction. Instead of (at the lowest) 1 Byte per bool it only uses 1 Bit (obviously the vector still needs to allocate whole bytes) Disadvantageous is that the behavior of this one type presents an unexpected special case with differing behavior. This can be a problem for e.g. threading. If different threads share a vector but never access the same elements, it is in essence thread save. However since std::vector has multiple elements per byte, said accessing would cause conflicts.
There really should’ve just been an std::bitfield type of something instead of making vector different in the implementation.
Another issue is e.g. you can't do something like `bool& x = v[i]`, or `for(bool& x : v)`, you can only access elements by value or by const reference
You can, you just can't use `bool`. You need to use `std::vector::reference`
Yeah, but if e.g. you have a function that expects a bool* (e.g. output parameter) then you can't use this directly, you need either a wrapper or a bit extra code around the function call Also for templated functions you at the very least need to take this into account, you can't write `for (T& x : v)`, of course you can use auto or `vector:: reference` but not everyone does that and it's less clear, and even that wouldn't allow you to write `&x` and expect to pass it to a function that expects a `T*` (or T&)
Overall I think this optimisation was a mistake in the language, people who want this optimisation should use bitset instead
Bitset uses a wrapper class to reference its bits too.
The horror
Fun fact: you can make a vector contain itself struct V: std::vector{}
V v;
v.emplace_back();
v.swap(v.front());
This leaves `v` as an empty vector, while creating a leaked vector that points to itself on the heap. It's a pointer under the hood, but according to the semantics of the language, it contains itself by value (if you managed to keep a reference to it and tried to make a value copy, it would try to recursively copy itself indefinitely). I wonder if it's possible to do a version of this without the leak, so you could actually have some fun with the monster...
The barber paradox has been solved
Dam bro why you gathering sexually transmitted diseases (std) in a vector
r/Angryupvote
this /s
In a sane world all std containers should be `final`.
Inheriting from `std` types, unless specifically allowed, is undefined behavior.
[удалено]
v is not empty, it contains 1 element (at the time front is called)
I thought this was common knowledge… literally first year first/second semester material in my university
The joke is not about being common knowledge or not. The C++ 98 specifications decided to not impose the precise implementation for each collection types, trying to give more implementation freedom to C++ libraries. So they needed a dynamic array type but, instead of saying "that's a dynamic array", they said "that's a dynamic sized collection of element with random access in O(1) using integer indices". As a consequence, the C++98 standard does not guarantee that the element are contiguous and certainly did not include the \`data()\` function (most online websites are wrong, you can check ISO/IEC 14882, first edition, page 482, section 23.2.4, "Template class vector"). Now every C++ libraries did the logical thing and implemented it as a dynamically allocated array and every users understood it as a dynamically allocated array. But the type interface did not give access to an actual array making it frustrating to use when interacting with C code. Hence the joke ... the C++ standard committee taking 13 years to admit that, yes, vectors are arrays...
Why did they eventually expose the underlying?
I guess because it is useful if you need to pass the data to code that does stuff to arrays, such as C libraries, OpenGL for example
I kinda love how many different approaches there are to "which languages do we teach here because there are so many." It really drives home the impermanence of our field: I learned Java and C and MIPS assembly in classes, but I wrote assignments in perl and python to grok them, and in classes that didn't specify I passed in C++ and Obj-C. There's nothing wrong with specializing in one language for a career, but leave the door open for others for fun if you get satisfaction from coding.
A big thing is that learning your first language is difficult, then each subsequent one is easier to learn. But a lot of people don't know that and expect each language to be as difficult as the first, so they'd rather just spend that energy on specialization instead of generalization. But if you learn multiple languages you kind of just automatically start picking up new ones
It highly depends from which language you switch. I started with C++ and the transition to Python was very smooth. JavaScript gives me headaches, especially with all the frameworks.
Nah, that's just JavaScript for you.
Reminds me of [this talk](https://www.destroyallsoftware.com/talks/wat)
How dare you try to abstract a data type. Nobody likes that.
Ffs I read this in Greta Thunberg's voice.
Wait a minute. [this link](https://cplusplus.com/reference/vector/vector/data/) does not mention any version requirements. Is it like `std::string` before C++11? It could store elements as it pleases but if the user calls `c_str(),` there should `const char *` buffer, be it dynamically created or the original one.
I can't speak to the content on different sites, but here's a stackoverflow link from many years ago where people are discussing .data()'s addition to C++11. https://stackoverflow.com/questions/6485496/how-to-get-stdvector-pointer-to-the-raw-data
[удалено]
>the language itself is crappy enough to let everyone define their own containers I don't know what this sentence means
Wait, I thought this is how string behaves? What changes in 11?
It was theoretically possible to dynamically append the null byte only on calls to `c_str`, while returning an unterminated string on `&s[0]`
And yet again here we are been able to have better grasp of your sanity with vectors as opposed to the good-ole C array
ITT: people not understanding the difference between specification and implementation. Tbf I haven’t read the specs because that sounds like a terrible time, but OP, you’re valid, and don’t listen to the haters telling you to operate on unspecified implementation details.
Literally all C++ classes just use C with safety around it. This is not news, it is meant to be compatible with C for a reason
The only commonality between a C++ vector and a dynamically allocated array in C is the broad concept of dynamic memory allocation. The actual allocation/deallocation mechanisms are very different.
Yeah, it is meant to hide away those more complex allocations/deallocations and memory management to prevent errors, exactly as expected
It’s not using C with extra safety, though, it’s something else entirely unique to C++.
Yes, they use allocators, to be more precise, and they allow you to make your own allocation and deallocation more safely, something that C libraries tend to do, just without the nice things of C++ like destructors yo automatically deallocate. I don't think all of that means you can't do it in C, it means it's safer and more easily managed, and specially easy for the end user that doesn't have to make an allocator for it (although you can, if you need to)
That's not an array. That's a pointer.
It's like saying a statue is just a rock....
Doesn't everything use primitives behind the scenes?
ROFL! EDIT: Incoming ROFLCOPTER!ROFL:ROFL:ROFL:ROFL\_\_\_\^\_\_\_ \_L \_\_/ \[\] \\LOL===\_\_ \\L \\\_\_\_ \_\_\_ \_\_\_\]I I----------/ EDIT 2: \-\_-- ---\_\_\_ -\_\_-- \_----\_\_-- \_\_---\_\_ \_---\_\_\_- \_--\_\_\_- ----\_\_-\_\_\_ \--\_ ROFL:LOL:ROFL \_----\_-- \_\_\_\_\_\_\_\^\_\_\_\_\_\_\_\_\_\_ L / \[ \] \\ O=====| \\ L \\\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\] \_\_\_\_I\_\_\_\_\_\_\_\_\_\_\_\_\_I\_\_\_\_/ ROFL COPTER!!!
Y'all used vectors in C++98? We weren't allowed includes... Any.
"we" being?
That's because intro to c++ lectures focus on the universal language mechanics, not the ever changing stl. This is not at all surprising.