C++ lambdas, for beginners
What's this all about
When C++11 came out it came with a lot of goodies, and one of the best additions to the language repertoire was lambda functions.
Most people's first exposure to lambda functions in C++ is in the form of "throwaway functions" scattered around the code, because that's what lambdas are used for most of the time. Take for instance:
auto addition = [](int x, int y) { return x + y; };
std::cout << addition(4, 3) << std::endl;
std::cout << addition(99, 1) << std::endl;
Here a lambda function is created to perform a simple addition between two numbers, and then the lambda gets called a couple of times with different input arguments each time. When execution exits the scope where the lambda object was created, the lambda will be destroyed just like any other local variable.
Lambdas can do more than that, though. Lambdas are executable objects, and just like other objects, they can be moved around, copied, stored in containers, passed as arguments, etc. Lambdas can carry a context with them, so that they can be created in one place and transferred to another for execution. Lambdas can be stateful, and can also be linked to the state of an external object.
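As a rough illustration of lambdas being stored and passed around, the sketch below keeps two closures in a container and hands a third one to a standard algorithm. The function names (apply_all, run_pipeline, count_even) are made up for this example, and std::function is just one convenient way to give differently typed closures a common type.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Applies every step, in order, to the starting value.
// std::function gives closures of different types a common signature.
int apply_all(const std::vector<std::function<int(int)>>& steps, int value) {
    for (const auto& step : steps) {
        value = step(value);
    }
    return value;
}

// Builds a small pipeline of lambdas, stores them in a container and runs it.
int run_pipeline(int start) {
    int offset = 10;  // context carried by the second lambda

    std::vector<std::function<int(int)>> steps;
    steps.push_back([](int n) { return n * 2; });             // no context
    steps.push_back([offset](int n) { return n + offset; });  // carries a copy of offset

    return apply_all(steps, start);
}

// Passes a lambda as an argument to a standard algorithm.
int count_even(const std::vector<int>& values) {
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), [](int n) { return n % 2 == 0; }));
}
```

With these definitions, run_pipeline(1) doubles the value and then adds the captured offset, and count_even({1, 2, 3, 4}) counts the two even entries.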
It must be said that lambdas don't really bring anything radically new to the C++ table. Everything that can be done with lambdas today could already be done in pre-C++11 versions of C++ using good old functor objects (objects that can be called like functions), if you bothered to make the effort to write them.
The real advantage of the lambda notation, though, is that it provides a compact way to describe an executable object in such a way that the compiler can do the implementation for us. The programmer can then focus on the what instead of on the how, and explore design possibilities that might have been avoided otherwise.
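To make the comparison concrete, here is a minimal sketch of the same addition written both ways. The Addition struct and the same_result helper are names invented for this example; the point is only that the lambda saves you from writing the class by hand.

```cpp
// Hand-written functor: the pre-C++11 way of getting a callable object.
struct Addition {
    int operator()(int x, int y) const { return x + y; }
};

// Returns true when the functor and the lambda agree for the given inputs.
bool same_result(int x, int y) {
    Addition addition_functor;                                  // class written by hand
    auto addition_lambda = [](int a, int b) { return a + b; };  // class written by the compiler
    return addition_functor(x, y) == addition_lambda(x, y);
}
```

Both objects are called with the same syntax; the difference is only in how much code had to be typed to get them.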
During the rest of the article I'll try to introduce you to how lambda functions get declared, and what the ins and outs of variable capture are.
Please notice that the article is aimed at beginners trying to "get it" and/or intermediate users looking for a refresher. If you already "get" lambdas, you can try the references cited at the very end for better material.
The full name of the Lambda
Let's start by getting semantics out of the way.
In reality, there's no such thing as a lambda. "Lambda" is just a colloquial term that, depending on the context, will usually refer to one of three things.
- A lambda expression.
- A closure object.
- A closure class.
All three of them are present in the following statement:
auto addition = [](int x, int y) { return x + y; };
Here,
- [](int x, int y) { return x + y; } is the lambda expression.
- addition is the closure object.
- You can't see the closure class, but it's there in the form of the type of the closure object.
Whenever you hear someone talking about a lambda, they are most likely talking about one of the three above, and in most cases it'll be one of the first two (the third one gets far less press time).
I'll keep it light and talk about lambdas when it suits me better, but I'll use the terms above when the distinction is important.
Analysis of the lambda expression
The lambda expression is the most visible expression of the lambda in the code.
The structure of the lambda expression is
[...capture list...](...parameters...) -> <return type> { ... body ... }
but in practice, it will be a lot simpler than this. In fact, the humblest of all lambda expressions is
[]{}
which defines a lambda that captures nothing, takes no parameters, does nothing, and returns no data; that lambda is, in other words, equivalent to this function:
void f() {}
Just like for function declarations, the code of the lambda is enclosed within the {} brackets, and the input parameters within the () pair. There's not a lot to say about either of these components of the lambda, since they follow both in form and in function from their equivalents in the function declaration.
The () of the parameter list can be completely omitted when the lambda is not meant to receive any parameters. That's why the humblest lambda is []{} and not [](){}. If a return type is specified, however, the () needs to be written even if the parameter list is empty (C++23 relaxes this requirement).
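The three spellings can be put side by side in a small sketch. The call_all function and the variable names are made up for the example, which assumes a C++11 or C++14 compiler.

```cpp
// Calls three equivalent-looking lambdas and adds up their results.
int call_all() {
    auto no_parens   = [] { return 1; };           // parameter list omitted
    auto with_parens = []() { return 2; };         // empty parameter list written out
    auto with_return = []() -> int { return 3; };  // return type given, so () is written too
    return no_parens() + with_parens() + with_return();
}
```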
I want to talk at length about captures, which is what separates lambdas from just being throwaway functions, but allow me to get something out of the way first and discuss the return type of the lambda.
The return type of the lambda
The return value type of the lambda can be omitted from the expression, and most people will not write it. When the type is not present the compiler will infer it from the rest of the expression.
I won't be writing the return type in the rest of the examples in this article unless I really need it. That's why I wanted to talk about this early: it's less typing for me.
Only in a minority of cases will you see lambda expressions that explicitly state the return type of the lambda. In those cases it will usually be because of one of two reasons:
- The lambda body is such that the return type is ambiguous to the compiler.
- The coder wants to coerce the return type into something different from what the compiler would have inferred.
Whenever the return type is not written in the lambda expression the compiler uses a simple but effective set of rules to determine what type you meant to return:
- If there are no return statements, or if the ones present are bare returns with no return value (such as return;), then the lambda is assumed to return void.
- If there's a single return statement and it has a return expression, the return type will be the type that results from evaluating the expression.
- If there are multiple return statements and all of them have a return expression that evaluates to the very same type, that will be the return type of the lambda.
- If none of the above is the case, the compiler will give up and then you'll have to explicitly state the return type.
A few examples:
This declares a lambda that just returns void:
auto f1 = [](){}; // could be just []{}
Here the return expression evaluates to int, and therefore the return type of the lambda will be the same type:
auto f3 = [](int n) { return n; };
The return expression here evaluates to bool, so that'll be the type of the return value:
auto f4 = [](int n) { return n > 5; };
There are two return expressions in the following example, but both are bool. Since there's no ambiguity the return type will be bool:
auto f5 = [](int n) {
    if (n == 5) {
        return true;
    } else {
        return false;
    }
};
In this case there are two return statements with expressions that evaluate to different types (bool and int). Here the programmer needs to state the return type explicitly, because it's not possible for the compiler to decide what the return type should be:
auto f6 = [](int n) -> bool {
    if (n < 0) {
        return false;
    } else {
        return n; /* returns true if n > 0 */
    }
};
In this case we state the return type to coerce it to be a bool instead of the unsigned integer type that v.size() returns.
auto f7 = [](const std::vector<int>& v) -> bool { return v.size(); };
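If you want to double-check what the compiler deduced, one option is to ask it with decltype and static_assert. The sketch below does that for lambdas shaped like the ones above; the function and variable names are invented for the example, and a failed check would show up as a build error rather than at run time.

```cpp
#include <type_traits>
#include <vector>

// The static_asserts are verified at compile time; the function body
// then calls each lambda once so the checks are exercised at run time too.
bool return_types_are_as_expected() {
    auto returns_void = [] {};
    auto returns_int  = [](int n) { return n; };
    auto returns_bool = [](int n) { return n > 5; };
    auto forced_bool  = [](const std::vector<int>& v) -> bool { return v.size(); };

    static_assert(std::is_same<decltype(returns_void()), void>::value,
                  "no return statement: void");
    static_assert(std::is_same<decltype(returns_int(1)), int>::value,
                  "return n: int");
    static_assert(std::is_same<decltype(returns_bool(1)), bool>::value,
                  "return n > 5: bool");
    static_assert(std::is_same<decltype(forced_bool(std::vector<int>{})), bool>::value,
                  "explicit -> bool wins over the type of v.size()");

    returns_void();
    return returns_int(7) == 7 && returns_bool(7) &&
           forced_bool({1, 2, 3}) && !forced_bool({});
}
```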
The capture list
The capture list of a lambda is a list of variable names in the enclosing scope of the lambda that have to be captured to become accessible within the lambda.
In a lambda expression the capture list is written between the square brackets []. It is not optional, and the brackets must be present even if the capture list is empty.
A capture creates a variable within the scope of the lambda body that has the same name as the variable being captured, and which can be (depending on the mode of capture) either a copy of the external variable, or a reference to it.
Captures that generate a copy of the external variable will be called by-copy captures, while the ones that generate a reference to an external variable will be called by-reference captures.
- A by-copy capture creates a variable within the scope of the lambda that is a copy of the variable being captured, with the same value the latter had at the time the lambda was created. By-copy captures are read-only, unless the lambda is mutable (more on that later).
- A by-reference capture stores a reference to the variable being captured, which can be used to read or update the value of the external variable at any time during the lifetime of the lambda.
To ground those definitions with an example, take a look at the code fragment below. In it two local variables get defined, followed by the definition of a lambda function that captures them both.
int foo = 33;
int bar = 22;
auto within_range = [foo, &bar](int n) {
    // true when n lies between bar and foo, inclusive
    return (bar <= n) && (n <= foo);
};
The capture list in the lambda expression captures both variables: foo is captured by-copy, while bar is captured by-reference.
It's important to realize that while the names of the captured variables are the same as the names of the external variables that they mirror, they are different variables. That is, the foo variable within the body of the lambda is not the same foo outside.
As you probably guessed from the example, the capture mode is declared by prefixing by-reference captures with &. By-copy captures, on the other hand, have no prefix at all.
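The difference between the two modes is easiest to see by changing the external variables after the lambda has been created. The following is a small sketch of that experiment; observe_captures and read_both are names made up for the example.

```cpp
#include <utility>

// Returns {value seen through the by-copy capture,
//          value seen through the by-reference capture}.
std::pair<int, int> observe_captures() {
    int foo = 33;
    int bar = 22;

    auto read_both = [foo, &bar] { return std::make_pair(foo, bar); };

    // Change both external variables after the lambda has been created.
    foo = 1;
    bar = 2;

    // The copy of foo still holds 33; bar is read through the reference.
    return read_both();
}
```

The returned pair is {33, 2}: the by-copy capture kept the value foo had when the lambda was created, while the by-reference capture followed the update to bar.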
Additionally, there are two default capture modes that allow us to capture every variable that is used in the body and not explicitly mentioned in the capture list:
- A bare & will capture by-reference anything used but not explicitly captured by-copy.
- A bare = will capture by-copy anything used but not explicitly captured by-reference.
In a typical capture list you'll find a mixture of default capture modes with named captures. There are a few rules to this mix, however.
- The default modes = and & cannot be both present.
- If a default mode is present, it must lead the list.
- If a = default capture is present, any named capture that follows must be by-reference.
- If a & default capture is present, any named capture that follows must be by-copy.
The rules make sense if you think about it, so it's not really necessary to worry about them too much.
These are some examples of typical capture lists:
- [] Empty capture list, nothing will be captured.
- [foo] Capture foo by-copy.
- [&bar] Capture bar by-reference.
- [foo, &bar] Capture foo by-copy and bar by-reference.
- [=] Capture anything named from the enclosing scope by-copy.
- [&] Capture anything named from the enclosing scope by-reference.
- [&, foo, bar] Capture anything named from the enclosing scope by-reference, except foo and bar, which must be captured by-copy.
- [=, &foo] Capture anything named from the enclosing scope by-copy, except foo, which must be captured by-reference.
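A short sketch of two mixed capture lists in action follows. The names default_capture_demo, add_step and add_twice are invented for the example; the comments track the value of total step by step.

```cpp
// Returns the final value of total, which both lambdas update
// through a by-reference capture.
int default_capture_demo() {
    int step = 5;
    int total = 0;

    // [=, &total]: anything used is copied, except total.
    auto add_step = [=, &total] { total += step; };

    // [&, step]: anything used is referenced, except step.
    auto add_twice = [&, step] { total += 2 * step; };

    step = 100;   // neither lambda sees this: both hold a copy of step (5)
    add_step();   // total == 5
    add_twice();  // total == 15
    return total;
}
```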
Mutable lambdas
By default by-copy captures are not writable, and therefore the following fragment is an error:
int value{0};
auto bad_lambda = [value]() { value += 10; }; // error: value is read-only in a non-mutable lambda
By-copy captures can be made writable if the lambda is declared as mutable. This makes the lambda stateful: any change you make to a by-copy capture will be carried over to the next execution of the same lambda.
In the following example the lambda will remember any update to the value of the captured initial_value variable. The value of the external variable, however, will remain unchanged because the lambda updates a copy.
int initial_value{5};
auto counter_lambda = [initial_value]() mutable {
    std::cout << initial_value++ << std::endl;
};
// each call will increment the internal copy
// stored within the lambda, and the change carries over to
// the next call.
counter_lambda(); // will print 5
counter_lambda(); // will print 6
counter_lambda(); // will print 7
// the original variable outside of the lambda is unchanged
std::cout << initial_value << std::endl; // will print 5
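Since the state lives inside the closure object, copying a mutable lambda also copies its state, and from then on the two objects evolve separately. The sketch below illustrates this under the same setup as the counter above; independent_counters is a name made up for the example.

```cpp
#include <vector>

// Returns the values produced by a counter closure and by a copy of it.
std::vector<int> independent_counters() {
    int start = 5;
    auto counter = [start]() mutable { return start++; };

    std::vector<int> seen;
    seen.push_back(counter());  // 5
    seen.push_back(counter());  // 6

    auto copy = counter;        // the copy starts from the current state (7)
    seen.push_back(copy());     // 7
    seen.push_back(copy());     // 8
    seen.push_back(counter());  // 7: the original never saw the copy's updates
    return seen;
}
```

This is worth keeping in mind when a stateful lambda is passed by value to a function: the callee works on its own copy.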
By-reference captures, on the other hand, can be both read and written regardless of whether the lambda is mutable or not.
int total{0};
auto accumulate = [&total](int n) { total += n; };
// each call updates the value of the referenced variable
accumulate(1);
accumulate(2);
accumulate(3);
// print the accumulated value, 6
std::cout << total << std::endl;
Generalized captures (C++14 and later)
All the talk so far has been about captures as they were introduced when C++11 came out.
These captures work great, and there's a lot you can do with them, but after a while you'll notice that there are a couple of ways in which they fall short:
- A capture always has the same name as the variable that was captured. This is not a big deal, of course, but sometimes you'd like to be able to name them something else.
- To capture a value it needs to be previously stored in a variable; it's not possible to capture the result of an expression.
- You can't use move semantics with captures. Captured objects need to be copyable; if they are not then you'll need to capture them by-reference, which may create a problem of ownership, or you'll need to do some other trick. This may be particularly annoying if you use unique_ptr a lot.
To mend this, C++14 upgraded lambdas with generalized lambda captures, which
- allow you to give the capture variable any name you like.
- allow you to capture not only variables, but also the result of expressions (only by-copy).
- more importantly, allow you to capture move-only objects like unique_ptr instances.
The price to pay for these welcome improvements is that generalized captures are a bit more verbose than regular captures because you need to state both the name of the variable being captured, and the name of the capture variable created within the lambda. The syntax is:
- internal_name = expression for by-copy captures.
- &internal_name = external_name for by-reference captures.
For example, the following fragment uses generalized captures to capture counter by reference (naming it cnt within the lambda), and also captures the result of 3 * mean_level by copy (naming the result limit within the lambda).
int mean_level{5};
int counter{0};
auto f = [&cnt = counter, limit = 3 * mean_level]() {
    if (cnt < limit) cnt++;
};
To capture a move-only object, you just need to make sure the right side of the by-copy initializer is an rvalue, which can be done by providing a temporary value, or by using std::move:
auto adapter = std::make_unique<Adapter>();
auto runner = [adapter = std::move(adapter)]() { adapter->run(); };
The extra verbosity of generalized lambda captures is a very small price to pay given that you're not even required to pay for it: you can still use regular C++11 captures when that suits you better, and mix generalized and regular captures to get the best of each:
auto f = [&counter, limit = 3 * mean_level]() {
    if (counter < limit) counter++;
};
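The Adapter type in the move-capture fragment above is not defined in the article, so here is a self-contained sketch with a hypothetical stand-in for it. It assumes C++14 (for std::make_unique and the generalized capture); the Adapter struct and the run_twice function are invented for the example.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the Adapter type used in the fragment above.
struct Adapter {
    int runs = 0;
    int run() { return ++runs; }
};

// Moves a unique_ptr into a closure and calls the closure twice.
// Returns the number of runs recorded by the adapter the closure owns.
int run_twice() {
    auto adapter = std::make_unique<Adapter>();

    // The closure becomes the sole owner of the Adapter;
    // the outer adapter variable is left holding a null pointer.
    auto runner = [adapter = std::move(adapter)] { return adapter->run(); };

    runner();
    return runner();
}
```

Because the closure owns a unique_ptr, the closure itself becomes move-only: it can be moved into another variable or container, but not copied.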
What can and what cannot be captured
Earlier I said that only variables in the immediate local scope of the lambda can be captured. I mentioned it only in passing, and it probably flew under the radar when I said it.
However, this is not a minor detail or a technicality, and to see why let's see what cannot be captured.
- Global scope variables and static data members can't be captured.
- Non-static class members can't be captured directly.
The first one may strike you as obvious if you think about it: globals are accessible from within any function, and since lambdas are function-like ("callable") objects, there's no reason for them to be an exception.
Still, you should keep in mind that globals must be regarded within a lambda just like by-reference captures are. This may have important implications in multi-threaded programs.
Static class members are just globals in disguise, so it's no surprise that as far as lambdas are concerned, they have the exact same restrictions.
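The "globals behave like by-reference captures" point can be checked with a small experiment. In the sketch below g_limit is a hypothetical global introduced only for the example; note that even the [=] default capture mode does not take a snapshot of it.

```cpp
#include <utility>

int g_limit = 10;  // hypothetical global, used only in this sketch

// Returns {value read before the global changes, value read after}.
std::pair<int, int> global_is_shared() {
    g_limit = 10;

    // Empty capture list: g_limit is a global, so it is simply visible in the body.
    auto read_limit = [] { return g_limit; };
    // Even with [=] nothing is copied, because globals are never captured.
    auto read_limit_eq = [=] { return g_limit; };

    int before = read_limit();    // 10
    g_limit = 20;                 // change the global after the lambdas were created
    int after = read_limit_eq();  // 20: the lambda reads the global itself

    return {before, after};
}
```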
Non-static class members cannot be captured either, but here the truth is more nuanced: actually they can, kind-of, but they cannot be captured in the same sense in which captures work for regular local variables.
Before we dig deeper into this, however, we need to take a short detour to talk about the this pointer.
Captures and the this pointer
During execution each non-static class method has access to an implicitly created this pointer that references the instance on which the method was called. This is what gives methods access to non-static data members of the class.
Typically you don't need to de-reference that pointer explicitly since the compiler will do it for you, but you can if you want to be more explicit. For instance, in this fragment
class Value {
public:
    void set(const int x) { x_ = x; }
    int get() const { return x_; }
private:
    int x_;
};
we can make the dependency on this more visible by rewriting both methods as
void set(const int x) { this->x_ = x; }
int get() const { return this->x_; }
It's important to notice that this does not change the way the code is compiled; it only makes explicit what the compiler is doing behind your back.
This detour to talk about this (pun intended) is because now that we have unmasked how access to data members works it's easier to understand the nuanced version of how non-static data member capture works.
Non-static class members cannot be captured because they are not in the local scope around the lambda; they are in the class scope.
However, the this pointer can be captured, because it is a variable within the immediate scope of lambdas that get created in non-static methods of a class! By capturing this the lambda gets access to all non-static data members and also to instance methods.
In order to capture this, you just add it to the capture list:
auto is_empty = [this]() { return queue_.empty(); };
It's important to note that this is not a regular variable, and its nature imposes a limitation on the capture process: this can only be captured by-copy; trying to capture the pointer like in [&this] is a syntax error. Default capture modes will also capture this, but notice that even if this is captured by the & default capture mode, the pointer will still be copied.
Now, here comes the catch: remember that this is not the instance, this is a pointer to the instance. You're not capturing the instance by copy, you're only copying a pointer to it.
This is really important, because it means that any access to the instance members is still reference-like: by de-referencing this the lambda is accessing the original external variables, not copies of them!
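A small sketch of what that means in practice: a lambda that captures this and updates a data member changes the original object, even though the lambda is not mutable. The Counter class and the increment_through_lambda function are hypothetical names used only for this illustration.

```cpp
#include <functional>

// Hypothetical class, used only to illustrate capturing this.
class Counter {
public:
    // The returned closure stores a copy of the this pointer,
    // not a copy of the Counter object.
    std::function<void()> make_incrementer() {
        return [this] { ++count_; };
    }
    int count() const { return count_; }

private:
    int count_ = 0;
};

// Calls the closure twice and reads the result back from the object.
int increment_through_lambda() {
    Counter counter;
    auto increment = counter.make_incrementer();
    increment();
    increment();
    return counter.count();  // 2: the lambda updated the original object
}
```

The closure is only safe to call while counter is alive, since all it holds is a pointer to it.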
It can help to think of [this] as capturing the object by reference. It's not exact, but it's close enough.
Now, this detail can bite you, especially if you think that [=] means "everything gets captured by copy", like in the following example:
#include <iostream>
#include <functional>
#include <string>
using namespace std;
using Filter = std::function<bool(const std::string &)>;
class FilterFactory {
public:
    Filter buildFilter(const string &name) {
        name_ = name;
        return [=](const std::string &name) {
            return (name == name_);
        };
    }
private:
    std::string name_;
};
int main() {
    FilterFactory factory;
    auto filter_adam = factory.buildFilter("adam");
    auto filter_eva = factory.buildFilter("eva");
    cout << filter_adam("adam") << endl; // should have returned true, but returns false
    cout << filter_adam("eva") << endl; // should have returned false, but returns true
    cout << filter_eva("adam") << endl;
    cout << filter_eva("eva") << endl;
}
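The filters in the example above end up sharing the factory's name_ because the lambda reaches it through this. For comparison, here is a sketch of one possible variant in which each filter keeps its own copy of the string, using a C++14 generalized capture. CopyingFilterFactory, expected and filters_are_independent are names made up for this sketch, and this is only one of several ways to get the same effect.

```cpp
#include <functional>
#include <string>

using Filter = std::function<bool(const std::string&)>;

// Hypothetical variant of the factory above; only the capture list changes.
class CopyingFilterFactory {
public:
    Filter buildFilter(const std::string& name) {
        name_ = name;
        // expected is a copy of name_ stored inside the closure itself,
        // so the filter no longer goes through this.
        return [expected = name_](const std::string& candidate) {
            return candidate == expected;
        };
    }

private:
    std::string name_;
};

// Builds two filters, destroys the factory, and then uses the filters.
bool filters_are_independent() {
    Filter filter_adam;
    Filter filter_eva;
    {
        CopyingFilterFactory factory;
        filter_adam = factory.buildFilter("adam");
        filter_eva = factory.buildFilter("eva");
    }  // factory is destroyed here; each filter keeps its own copy

    return filter_adam("adam") && !filter_adam("eva") &&
           !filter_eva("adam") && filter_eva("eva");
}
```

In this variant the filters stay valid after the factory is gone, at the cost of one string copy per filter.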
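In the FilterFactory code above, a naive programmer might have expected [=] to capture everything by copy, including class members, which would make each lambda self-contained and completely independent of the original factory instance.
In reality, however, the default capture mode does not capture name_ at all. What gets captured is this, and the access to name_ is actually performed through this->name_.
The code builds, but it does not behave the way the coder expected. For all practical purposes name_ is accessed by reference, so the lambda can "see" any change made to the value of the variable after the lambda was created.
And it gets worse. If the factory variable is destroyed before the lambdas, the this pointer stored in each lambda becomes dangling, and any access to the data members of the instance it used to point to is undefined behavior.
An intuition of the closure class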
Before closing the article I'll talk a bit about the closure class. The idea is not to be rigorous, but just to provide the reader with an intuition of how lambdas get implemented by the compiler under the hood.
The first thing to state is that there's no single closure class. A custom closure class gets created automatically by the compiler for each lambda expression.
These compiler-generated classes cannot be seen or changed, because only the compiler knows what they look like. That's why it is frequently said that the closure objects have an anonymous type.
To take a peek under the hood we can implement our own closure class from the description in the lambda expression. Without loss of generality, let's say we were asked to compile a fragment like the following one:
int foo;
bool bar;
auto lf = [foo, &bar](int factor) { return foo * bar * factor; };
From the expression we can deduce that upon construction the lambda needs to capture two variables, one by copy (an integer) and another by reference (a boolean). The lambda objects that get instantiated from the expression need to be callable with a single input parameter (an integer) and must return a value after execution (another integer).
A possible implementation of the closure type for the lambda in the example above would be the following one:
class ClosureType {
public:
    // bar is a bool captured by reference, as in the lambda above
    ClosureType(int foo, bool &bar) : foo_{foo}, bar_{bar} {}
    // int * bool * int evaluates to int
    int operator()(int factor) const {
        return foo_ * bar_ * factor;
    }
private:
    int foo_;
    bool &bar_;
};
Where you can see that:
- Captures become closure class constructor parameters.
- By-copy captures like foo are stored within member variables in each instance.
- By-reference captures like &bar are not stored themselves, but a reference to them is stored in the class.
- By overloading operator() the closure objects produced by the class become callable objects.
- The code body, return type and parameter list of the lambda expression become the body, return type and parameter list of the operator() overload.
- In this case the operator() overload is a const method because the lambda is not mutable.
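To convince yourself that a hand-written closure class and the lambda really behave alike, you can run both side by side. The sketch below does so with a ClosureType whose member types follow the int and bool variables of the fragment; closure_matches_lambda and the initial values are assumptions made for the example, and a real compiler-generated class may of course look different.

```cpp
// Hand-written closure class for
// [foo, &bar](int factor) { return foo * bar * factor; }
class ClosureType {
public:
    ClosureType(int foo, bool& bar) : foo_{foo}, bar_{bar} {}
    int operator()(int factor) const { return foo_ * bar_ * factor; }

private:
    int foo_;    // by-copy capture
    bool& bar_;  // by-reference capture
};

// Returns true when the hand-written closure and the lambda agree,
// both before and after the referenced variable changes.
bool closure_matches_lambda() {
    int foo = 3;
    bool bar = true;

    auto lf = [foo, &bar](int factor) { return foo * bar * factor; };
    ClosureType hand_written{foo, bar};

    bool same_before = lf(2) == hand_written(2);  // both 6

    bar = false;  // visible to both objects through the reference
    bool same_after = lf(2) == hand_written(2);   // both 0

    return same_before && same_after;
}
```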
In the closure example above it is readily visible that captured variables get read on construction, while lambda parameters get passed to the lambda on execution.
While not very rigorous, our example above is good enough to get an idea of what a closure class may look like and how each part of the expression affects the implementation of the lambda.
The closure class example shows how the number and mode of the captures affect the size of the closure objects that get instantiated from it: lambdas are more than just code (like a function would be), they carry a context with them, and the size of the context depends on the size and type of the captures.
By-reference captures are light, and only add a pointer to the size of the lambda object, but they don't guarantee that the captured object will exist at least as long as the lambda exists. By-copy captures, on the other hand, do guarantee the lifetime, but they can be heavy to move around because of the copy operation. Moving around lambdas will be at least as expensive as the most expensive by-copy capture they own.
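The weight of each capture mode can be observed with sizeof. The exact numbers are implementation-defined, so the sketch below only relies on one thing: a closure holding a by-copy capture of a 1024-byte array cannot be smaller than that array. ClosureSizes and measure_closures are names invented for the example.

```cpp
#include <array>
#include <cstddef>

struct ClosureSizes {
    std::size_t by_copy;
    std::size_t by_reference;
};

// Measures two closures that capture the same 1024-byte buffer.
ClosureSizes measure_closures() {
    std::array<char, 1024> buffer{};

    auto by_copy = [buffer] { return buffer[0]; };        // stores its own copy of the buffer
    auto by_reference = [&buffer] { return buffer[0]; };  // typically stores just a pointer

    return {sizeof(by_copy), sizeof(by_reference)};
}
```

On common 64-bit compilers the by-reference closure is usually 8 bytes, but that figure is an observation about typical implementations, not a guarantee.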
In reality compilers can use a variety of implementations for lambdas depending on what's better suited for each particular case. In particular, if a lambda does not capture any variable it has no context to carry, so its body can be compiled just like a plain anonymous function; the language reflects this by letting such closure objects convert implicitly to a pointer to that function.
That's why no-capture lambdas can be assigned to variables of the matching pointer-to-function type, but lambdas that capture variables cannot.
int factor = 2;
auto no_capture_lambda = [](int n) { return 2 * n; };
auto with_capture_lambda = [factor](int n) { return factor * n; };
int (*f_ptr_1)(int) = no_capture_lambda; // this is ok
int (*f_ptr_2)(int) = with_capture_lambda; // this fails to build
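Conclusion
There's more to say about lambdas, of course, but this is probably enough for a good first bite.
For a deeper coverage you can visit the somewhat terse but extremely complete cppreference.com page on the topic. There you'll also find plenty of information about the newer developments of lambda functions in C++17 and later, which I intentionally left out. If you have access to a copy of Effective Modern C++, read it. If you don't, go get one. The chapter on lambdas, like the rest of the book, is very well written and informative. There you'll find a neat trick to use move semantics with lambdas when generalized captures are not available in your compiler.
I didn't go much into passing lambdas around, but I tried to put the focus on the problems that arise from capturing things by reference, whether intentionally or by accident. Understanding this is very important to decide how to safely execute a lambda in a context different from the one where it was created, and how it interacts with other objects in multi-threaded code.
I hope you enjoyed this. Until next time!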
Reference
For more on this topic (C++ lambdas, for beginners), see the original article at https://dev.to/glpuga/c-lambdas-for-beginners-313c. The text may be freely shared or copied, but please keep this document's URL as the reference URL.