Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pXXXX - Fallible Allocators for Algorithms #14

Open
ThePhD opened this issue Feb 5, 2019 · 13 comments
Open

pXXXX - Fallible Allocators for Algorithms #14

ThePhD opened this issue Feb 5, 2019 · 13 comments
Assignees
Labels
enhancement New feature or request humans [design.lewg] To be mulled over by the sometimes rowdy, generally level-headed, but often weird people in LEWG. night elves [design.lewgi] The dark council standing inbetween you, outlander, and the Winterland.
Milestone

Comments

@ThePhD
Copy link
Owner

ThePhD commented Feb 5, 2019

std::scratch_space is a type that is meant to be passed to algorithms which currently internally and opaquely allocate to gain their complexity guarantees. Three such algorithms in the standard are known:

  • std::stable_sort
  • std::stable_partition
  • std::inplace_merge

For -ffreestanding and similar, we should provide overloads which take this scratch space bucket and work solely within it, to avoid paying the cost of opaque dynamic allocations inside of the algorithms.

@ThePhD ThePhD added the enhancement New feature or request label Feb 5, 2019
@ThePhD ThePhD added this to the C++23 milestone Feb 5, 2019
@ThePhD ThePhD self-assigned this Feb 5, 2019
@ThePhD ThePhD changed the title pXXXX - std::scratch_space for allocating algorithms pXXXX - std::scratch_space for allocating algorithms for C++23 Feb 5, 2019
@ThePhD
Copy link
Owner Author

ThePhD commented Feb 5, 2019

Despite people telling me it exists, I cannot find the overloads of Alexander Stepanov's STL algorithms with anything like a std::scratch_space: http://stepanovpapers.com/STL/DOC.PDF

@Morwenn
Copy link
Collaborator

Morwenn commented Feb 5, 2019

Don't forget that if you provide pre-allocated buffers, people will ask to support allocators too. It is tightly linked to std::temporary_buffer where the decisions was: either augment it to work with allocators or get rid of it. And they got rid of it.

@ThePhD ThePhD added the humans [design.lewg] To be mulled over by the sometimes rowdy, generally level-headed, but often weird people in LEWG. label Jun 10, 2019
@ThePhD
Copy link
Owner Author

ThePhD commented Jul 25, 2019

Some insight: MSVC's STL already has a basically-allocator plus optimistic-stack-allocation internals for this. We'd likely just be standardizing an existing library practice: need to look at __stable_partition in more libraries.

@ThePhD ThePhD added the night elves [design.lewgi] The dark council standing inbetween you, outlander, and the Winterland. label Jul 25, 2019
@Morwenn
Copy link
Collaborator

Morwenn commented Jul 25, 2019

IIRC both libc++ and libstdc++ rely on global operator new to allocate the memory (with varying support for sized deallocation and over-alignment), so it's most likely done on the heap.

Also related: Stepanov mentioned long ago that std::get_temporary_buffer was "currently" inefficient (the good ol' split the buffer size in half until you can allocate something) but that having dedicated low-level memory allocation functions could make it a very interesting solution. As far as I know, such a function was never proposed to WG14 anyway, and thus std::get_temporary_buffer remained forever unoptimized (there is a note about it in either Elements of Programming or From Mathematics to Generic Programming, I would need to find it again, it doesn't mention Howard's proposal).

There could be ways to optimize std::get_temporary_buffer if more low-level memory allocation functions existed:

  • First of all, maybe some memory allocation functions know at a given point how much memory they have left to allocate. I don't know whether it is the case or not, but it would allow the temporary buffer functions to make O(1) allocations instead of O(log n) with the current unoptimized method.
  • Failing that, it could still take advantage of realloc-like functions for regrowth, but let me take that to another comment to make more points.

@Morwenn
Copy link
Collaborator

Morwenn commented Jul 25, 2019

In the previous comment I started to mention regrowth for std::scratch_space: to me it is a major use case of such buffers. In cpp-sort I use it in algorithms such as merge_sort where I don't know how much memory I will need to allocate depending on the pattern, so I tend to reallocate as needed, which was more sensible in my benchmarks than always allocating memory equivalent to half of the collection.

As a practical example to discuss further, here is my temporary_buffer implementation in cpp-sort (unfortunately without allocator support). And here is its try_grow function:

auto try_grow(std::ptrdiff_t count) noexcept
    -> bool
{
    auto tmp = get_temporary_buffer<T>(count, buffer_size);
    if (not tmp.first) {
        // If the allocated buffer isn't bigger, keep the old one
        return_temporary_buffer<T>(tmp.first, tmp.second);
        return false;
    }
    // If the allocated buffer is big enough, replace the previous one
    return_temporary_buffer(buffer, buffer_size);
    buffer = tmp.first;
    buffer_size = tmp.second;
    return true;
}

Most of my algorithms - just like those you are trying to improve - don't actually need a buffer at all to work, but the bigger the buffer, the simpler the algorithm and the faster it should run. This function answers to those specific needs by performing the following operation: it tries to grow the buffer up to a given size, and doesn't allocate anything if it can't. In order to fo this, get_temporary_buffer is actually a reimplementation of the standard library one, except that it takes an additional min_count parameter: it gives up if it can't allocate more than min_count objects.

If better low-level memory allocation functions were provided, then we could most likely improve upon the status quo:

  • It could take advantage of the proposed size feedback from operator new (see P0901R4), which makes sense when you know that you might need to make your buffer grow because it might make your buffer big enough to not regrow at all in some cases.
  • Second, realloc-like functions could try to grow the buffer in-place. I know that a reallocation mechanism is currently proposed for standardization in P0894 (see later comment). Such a function was also proposed to WG14 for standardization by Howard Hinnant in N1085 - Proposal to augment the interface of malloc/free/realloc/calloc under the name negotiate_alloc but it never made it into the C standard, and thus not into the C++ standard either. The proposal was rejected by lack of interest (see next comment). These functions are actually very similar to my try_grow function, except that they are meant for in-place growth.
  • Third, when you want to reallocate but don't care about it happening in-place, I guess that if you've got a memory pattern such as [ free memory | scratch_space current buffer | free memory ], then a smart memory allocator might be able to claim and collapse those in a single region of storage, while with my current implementation I don't deallocate the current buffer before trying to allocate a new one, because in a threaded program I might end up with less memory if some other thread claims memory right after I've deallocated it. It's maybe an issue that a smart memory allocation function might be able to solve?

Also it's worth noting that it's really all about allocating and deallocating memory. There is an implicit contract that when the destructor or try_grow are called, there isn't an object left to destroy in the storage, so we don't care about moving copying or moving elements when a potential reallocation occurs.

It's a bit unstructured, but you now have my thoughts and ruminations about the use cases, potential optimization and assumptions about a potential std::scratch_space design. What's interesting is that the whole gamut of possible memory allocation optimizations can be used to improve these algorithms. My use case basically corresponds to reimplementing inplace_merge and stable_sort, so it should closely match what you would want from a feature designed to implement those algorithms.

@Morwenn
Copy link
Collaborator

Morwenn commented Jul 25, 2019

Concerning N1085, here is an answer by Howard himself:

I did not attend WG14 personally to present this paper. The feedback I got was that there was insufficient interest to move forward with it. I.e. no one present was excited enough about it to do the hard work, and even I, the author, wasn’t willing to shell out the vacation time and travel expense to make it happen.

I.e. good ideas are often not enough by themselves. Someone has to be willing to champion it in person. It takes time and money (the money for travel, nothing nefarious).

On the positive side it means that such memory allocation proposals weren't shot down because of implementability concerns or design issues, but purely out of lack of interest.

@Morwenn
Copy link
Collaborator

Morwenn commented Jul 25, 2019

And I found the realloc paper for C++: P0894. Here is the motivation part:

As we can see, there is no counterpart for realloc(). And there is a solid
reason for that: realloc() copies the memory block if it cannot be just
expanded or shortened. It is unacceptable behaviour because objects in C++
are not trivially copyable in general. Therefore realloc() cannot be used in
generic code. However the ability to resize already allocated memory block can
be very useful. We could potentially get performance and safety gain:
no need to move the existing data (performance) and consequently no risk
to cause an exception by throwing move-operation (safety). Yes, it cannot be
guaranteed that every resize request will be satisfied (it won't usually in
fact). And what do we do in such case? All we do today! Just fall back to the
current technics.

Interestingly enough the way I use the assumption that I mentioned a few comments ago - about not having to copy/move objects and just reallocating storage - might apparently be a good use case for realloc in C++ since I assume that I only manipulate memory which is just raw storage without objects allocated in it.

I think that I'm done commenting and throwing ideas and remarks for now. Sorry for the spam ^^'

@ThePhD
Copy link
Owner Author

ThePhD commented Jul 25, 2019

Nothing so educational could ever be labeled with a word like "spam".

@Morwenn
Copy link
Collaborator

Morwenn commented Jul 25, 2019

I updated my cpp-sort library to take advantage of what I mentioned in my big comment, and restructured said comment a bit to make it clearer and less redundant :)

@ThePhD
Copy link
Owner Author

ThePhD commented Nov 14, 2020

This issue has been subsumed by the need for a fallible allocator, which is separate from a "more complete" allocator (though should receive the same improvements).

@ThePhD ThePhD changed the title pXXXX - std::scratch_space for allocating algorithms for C++23 pXXXXX - Fallible Allocators for Algorithms Nov 14, 2020
@ThePhD ThePhD changed the title pXXXXX - Fallible Allocators for Algorithms pXXXX - Fallible Allocators for Algorithms Nov 14, 2020
@ThePhD
Copy link
Owner Author

ThePhD commented Nov 14, 2020

Depends on #34/p2256.

@h-vetinari
Copy link

image

Why not use a verb? The ranges name spaces is full of them, so why not also std::allocate? I mean, yes, people would probably get mad (because consistency, bikesheds, etc.), but perhaps it's still a less-bad trade-off overall (and might be less painful to keep looking at while you work on it).

Bonus: std::allocate pairs nicely with std::try_allocate; though C++ people would probably blow a gasket because it sounds like it may have been inspired/influenced by rust (the horror!).

@ThePhD
Copy link
Owner Author

ThePhD commented Jul 19, 2023

I could try that. I haven't come across any name that really helps.

Right now it's not too important, though, because one of the things I'm realizing is that this is going to have to coexist with the initial allocator's infrastructure anyways. Might as well just make all the function names be different and let the existence of said functions be the detection mechanism on any given allocator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request humans [design.lewg] To be mulled over by the sometimes rowdy, generally level-headed, but often weird people in LEWG. night elves [design.lewgi] The dark council standing inbetween you, outlander, and the Winterland.
Projects
None yet
Development

No branches or pull requests

3 participants